I. TECHNICAL FIELD

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a 
gene by mutating regulatory or inhibitory/instability 
sequences (INS) in the coding region of the gene which 
prevent or reduce expression. The invention also relates 
to constructs, including expression vectors, containing 
genes mutated in accordance with these methods and host 
cells containing these constructs.

The methods of the invention are particularly 
useful for increasing the stability and/or utilization of 
a mRNA without changing its protein coding capacity. 
These methods are useful for allowing or increasing the 
expression of genes which would otherwise not be expressed 
or which would be poorly expressed because of the presence 
of INS regions in the mRNA transcript. Thus, the methods, 
constructs and host cells of the invention are useful for 
increasing the amount of protein produced by any gene 
which encodes an mRNA transcript which contains an INS. 

The methods, constructs and host cells of the 
invention are useful for increasing the amount of protein 
produced from genes such as those coding for growth 
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag and env, for example.

The invention also relates to using the
constructs of the invention in immunotherapy and
immunoprophylaxis, e.g., as a vaccine, or in genetic
therapy after expression in humans. Such constructs can
include or be incorporated into retroviral or other
expression vectors or they may also be directly injected
into tissue cells resulting in efficient expression of the
encoded protein or protein fragment. These constructs may
also be used for in-vivo or in-vitro gene replacement,
e.g., by homologous recombination with a target gene
in-situ.



The invention also relates to certain
exemplified constructs which can be used to simply and
rapidly detect and/or define the boundaries of
inhibitory/instability sequences in any mRNA, methods of
using these constructs, and host cells containing these
constructs. Once the INS regions of the mRNAs have been
located and/or further defined, the nucleotide sequences
encoding these INS regions can be mutated in accordance
with the method of this invention to allow the increase in
stability and/or utilization of the mRNA and therefore
allow an increase in the amount of protein produced from
expression vectors encoding the mutated mRNA.

II. BACKGROUND ART

While much work has been devoted to studying
transcriptional regulatory mechanisms, it has become
increasingly clear that post-transcriptional processes
also modulate the amount and utilization of RNA produced
from a given gene. These post-transcriptional processes
include nuclear post-transcriptional processes (e.g.,
splicing, polyadenylation, and transport) as well as
cytoplasmic RNA degradation. All these processes
contribute to the final steady-state level of a particular
transcript. These points of regulation create a more
flexible regulatory system than any one process could
produce alone. For example, a short-lived message is less
abundant than a stable one, even if it is highly
transcribed and efficiently processed. The efficient rate
of synthesis ensures that the message reaches the
cytoplasm and is translated, but the rapid rate of
degradation guarantees that the mRNA does not accumulate
to too high a level. Many RNAs, for example the mRNAs for
proto-oncogenes c-myc and c-fos, have been studied which
exhibit this kind of regulation in that they are expressed
at very low levels, decay rapidly and are modulated
quickly and transiently under different conditions. See
M. Hentze, Biochim. Biophys. Acta 1090:281-292 (1991) for
a review. The rate of degradation of many of these mRNAs
has been shown to be a function of the presence of one or
more instability/inhibitory sequences within the mRNA
itself.

Some cellular genes which encode unstable or
short-lived mRNAs have been shown to contain A and U-rich
(AU-rich) INS within the 3' untranslated region (3' UTR)
of the transcript mRNA. These cellular genes include the
genes encoding granulocyte-monocyte colony stimulating
factor (GM-CSF), whose AU-rich 3' UTR sequences (containing
8 copies of the sequence motif AUUUA) are more highly
conserved between mice and humans than the protein
encoding sequences themselves (93% versus 65%) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)) and the myc
proto-oncogene (c-myc), whose untranslated regions are
conserved throughout evolution (for example, 81% for man
and mouse) (M. Cole and S.E. Mango, Enzyme 44:167-180
(1990)). Other unstable or short-lived mRNAs which have
been shown to contain AU-rich sequences within the 3' UTR
include interferons (alpha, beta and gamma IFNs);
interleukins (IL1, IL2 and IL3); tumor necrosis factor
(TNF); lymphotoxin (Lym); IgG1 induction factor (IgG IF);
granulocyte colony stimulating factor (G-CSF); myb
proto-oncogene (c-myb); and sis proto-oncogene (c-sis)
(G. Shaw and R. Kamen, Cell 46:659-667 (1986)). See also
R. Wisdom and W. Lee, Gen. & Devel. 5:232-243 (1991)
(c-myc); A. Shyu et al., Gen. & Devel. 5:221-231 (1991)
(c-fos); T. Wilson and R. Treisman, Nature 336:396-399
(1988) (c-fos); T. Jones and M. Cole, Mol. Cell. Biol.
7:4513-4521 (1987) (c-myc); V. Kruys et al., Proc. Natl.
Acad. Sci. USA 89:673-677 (1992) (TNF); D. Koeller et al.,
Proc. Natl. Acad. Sci. USA 88:7778-7782 (1991)
(transferrin receptor (TfR) and c-fos); I. Laird-Offringa
et al., Nucleic Acids Res. 19:2387-2394 (1991) (c-myc);
D. Wreschner and G. Rechavi, Eur. J. Biochem. 172:333-340
(1988) (which contains a survey of genes and relative
stabilities); Bunnell et al., Somatic Cell and Mol. Genet.
16:151-162 (1990) (galactosyltransferase-associated
protein (GTA), which contains an AU-rich 3' UTR with
regions that are 98% similar among humans, mice and rats);
and Caput et al., Proc. Natl. Acad. Sci. 83:1670-1674
(1986) (TNF, which contains a 33 nt AU-rich sequence
conserved in toto in the murine and human TNF mRNAs).

Some of these cellular genes which have been
shown to contain INS within the 3' UTR of their mRNA have
also been shown to contain INS within the coding region.
See, e.g., R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991) (c-myc); A. Shyu et al., Gen. & Devel. 5:221-231
(1991) (c-fos).

Like the cellular mRNAs, a number of HIV-1 mRNAs
have also been shown to contain INS within the protein
coding regions, which in some cases coincide with areas of
high AU content. For example, a 218 nucleotide region
with high AU content (61.5%) present in the HIV-1 gag
coding sequence and located at the 5' end of the gag gene
has been implicated in the inhibition of gag expression.
S. Schwartz et al., J. Virol. 66:150-159 (1992). Further
experiments have indicated the presence of more than one
INS in the gag-protease gene region of the viral genome
(see below). Regions of high AU content have been found
in the HIV-1 gag/pol and env INS regions. The AUUUA
sequence is not present in the gag coding sequence, but it
is present in many copies within gag/pol and env coding
regions. S. Schwartz et al., J. Virol. 66:150-159 (1992).
See also, e.g., M. Emerman, Cell 57:1155-1165 (1989) (env
gene contains both 3' UTR and internal
inhibitory/instability sequences); C. Rosen, Proc. Natl.
Acad. Sci. USA 85:2071-2075 (1988) (env); M.
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 (1989)
(env); F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol); A. Cochrane et al., J. Virol. 65:5303-5313
(1991) (pol). F. Maldarelli et al., supra, note that the
direct analysis of the function of INS regions in the
context of a replication-competent, full-length HIV-1
provirus is complicated by the fact that the intragenic
INS are located in the coding sequences of virion
structural proteins. They further note that changes in
these intragenic INS sequences would in most cases affect
protein sequences as well, which in turn could affect the
replication of such mutants.
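
For illustration only (this is not part of the cited work), the two
sequence measures used in the preceding paragraph, AU content and the
number of AUUUA motifs, could be computed as in the following Python
sketch. The function names and the toy sequence are hypothetical and
were chosen only for this example.

```python
def au_content(rna: str) -> float:
    """Return the fraction of A and U residues in an RNA sequence."""
    rna = rna.upper()
    if not rna:
        return 0.0
    return sum(base in "AU" for base in rna) / len(rna)


def count_motif(rna: str, motif: str = "AUUUA") -> int:
    """Count occurrences of a motif, allowing overlapping matches."""
    rna, motif = rna.upper(), motif.upper()
    count, start = 0, rna.find(motif)
    while start != -1:
        count += 1
        start = rna.find(motif, start + 1)
    return count


# Toy sequence invented for illustration; it is NOT an HIV-1 sequence.
toy = "GCAUUUAUUUAGC"
print(f"AU content: {au_content(toy):.3f}, AUUUA motifs: {count_motif(toy)}")
```

Overlapping matches are counted because adjacent AUUUA motifs can share
a terminal A, as in the toy sequence.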

The INS regions are not necessarily AU-rich.
For example, the c-fos coding region INS is structurally
unrelated to the AU-rich 3' UTR INS (A. Shyu et al., Gen.
& Devel. 5:221-231 (1991)), and some parts of the env
coding region, which appear to contain INS elements, are
not AU-rich. Furthermore, some stable transcripts also
carry the AUUUA motif in their 3' UTRs, implying either
that this sequence alone is not sufficient to destabilize
a transcript, or that these messages also contain a
dominant stabilizing element (M. Cole and S.E. Mango,
Enzyme 44:167-180 (1990)). Interestingly, elements unique
to specific mRNAs have also been found which can stabilize
a mRNA transcript. One example is the Rev responsive
element, which in the presence of Rev protein promotes the
transport, stability and utilization of a mRNA transcript
(B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989)).

It is not yet known whether the AU sequences
themselves, and specifically the Shaw-Kamen sequence,
AUUUA, act as part or all of the degradation signal. Nor
is it clear whether this is the only mechanism employed
for short-lived messages, or if there are different
classes of RNAs, each with its own degradative system.
See M. Cole and S.E. Mango, Enzyme 44:167-180 (1990) for
a review; see also T. Jones and M. Cole, Mol. Cell.
Biol. 7:4513-4521 (1987). Mutation of the only copy of
the AUUUA sequence in the c-myc RNA INS region has no
effect on RNA turnover; therefore the inhibitory sequence
may be quite different from that of GM-CSF (M. Cole and
S.E. Mango, Enzyme 44:167-180 (1990)), or else the mRNA
instability may be due to the presence of additional INS
regions within the mRNA.

Previous workers have made mutations in genes
encoding AU-rich inhibitory/instability sequences within
the 3' UTR of their transcript mRNAs. For example, G.
Shaw and R. Kamen, Cell 46:659-667 (1986), introduced a 51
nucleotide AT-rich sequence from GM-CSF into the 3' UTR of
the rabbit β-globin gene. This insertion caused the
otherwise stable β-globin mRNA to become highly unstable
in vivo, resulting in a dramatic decrease in expression of
β-globin as compared to the wild-type control. The
introduction of another sequence of the same length, but
with 14 G's and C's interspersed among the sequence, into
the same site of the 3' UTR of the rabbit β-globin gene
resulted in accumulation levels which were similar to that
of wild-type β-globin mRNA. This control sequence did not
contain the motif AUUUA, which occurs seven times in the
AU-rich sequence. The results suggested that the presence
of the AU-rich sequence in the β-globin mRNA specifically
confers instability.

A. Shyu et al., Gen. & Devel. 5:221-231 (1991),
studied the AU-rich INS in the 3' UTR of c-fos by
disrupting all three AUUUA pentanucleotides by single
U-to-A point mutations to preserve the AU-richness of the
element while altering its sequence. This change in the
sequence of the 3' UTR INS dramatically inhibited the
ability of the mutated 3' UTR to destabilize the β-globin
message when inserted into the 3' UTR of a β-globin mRNA
as compared to the wild-type INS. The c-fos
protein-coding region INS (which is structurally unrelated
to the 3' UTR INS) was studied by inserting it in-frame
into the coding region of a β-globin and observing the
effect of deletions on the stability of the heterologous
c-fos-β-globin mRNA.
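
The idea of disrupting AUUUA pentanucleotides by single U-to-A changes
while keeping the element AU-rich can be sketched as follows. This is
an illustrative assumption, not the procedure of the cited authors: the
source does not say which U was changed, so the sketch arbitrarily
changes the middle U of each motif, and the toy sequence is invented.

```python
def disrupt_auuua(rna: str) -> str:
    """Change the middle U of every AUUUA to A until no motif remains.

    Which U to mutate is an assumption made for this sketch.
    The loop terminates because each pass removes at least one U.
    """
    rna = rna.upper()
    while "AUUUA" in rna:
        rna = rna.replace("AUUUA", "AUAUA")
    return rna


def au_count(rna: str) -> int:
    """Number of A or U residues in the sequence."""
    return sum(base in "AU" for base in rna.upper())


# Toy sequence invented for illustration; not the c-fos 3' UTR.
toy = "CCAUUUAGGAUUUAUUUACC"
mutant = disrupt_auuua(toy)
print(toy, "->", mutant)
print("AU residues before/after:", au_count(toy), au_count(mutant))
```

Because each change swaps a U for an A, the number of A and U residues,
and therefore the AU-richness, is unchanged while the motif disappears.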

Previous workers have also made mutations in
genes encoding inhibitory/instability sequences within the
coding region of their transcript mRNAs. For example, P.
Carter-Muenchau and R. Wolf, Proc. Natl. Acad. Sci. USA
86:1138-1142 (1989), demonstrated the presence of a
negative control region that lies deep in the coding
sequence of the E. coli 6-phosphogluconate dehydrogenase
(gnd) gene. The boundaries of the element were defined by
the cloning of a synthetic "internal complementary
sequence" (ICS) and observing the effect of this internal
complementary element on gene expression when placed at
several sites within the gnd gene. The effect of single
and double mutations introduced into the synthetic ICS
element by site-directed mutagenesis on regulation of
expression of a gnd-lacZ fusion gene correlated with the
ability of the respective mRNAs to fold into secondary
structures that sequester the ribosome binding site.
Thus, the gnd gene's internal regulatory element appears
to function as a cis-acting antisense RNA.

M. Lundrigan et al., Proc. Natl. Acad. Sci. USA
88:1479-1483 (1991), conducted an experiment to identify
sequences linked to btuB that are important for its proper
expression and transcriptional regulation in which a DNA
fragment carrying the region from -60 to +253 (the coding
region starts at +241) was mutagenized and then fused in
frame to lacZ. Expression of β-galactosidase from variant
plasmids containing a single base change was then
analyzed. The mutations were all G·C to A·T transitions,
as expected from the mutagenesis procedures used. Among
other mutations, a single base substitution at +253
resulted in greatly increased expression of the btuB-lacZ
gene fusion under both repressing and nonrepressing
conditions.

R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991), conducted an experiment which showed that mRNA
derived from a hybrid full-length c-myc gene, which
contains a mutation in the translation initiation codon
from ATG to ATC, is relatively stable, implying that the
c-myc coding region inhibitory sequence functions in a
translation-dependent manner.

R. Parker and A. Jacobson, Proc. Natl. Acad.
Sci. USA 87:2780-2784 (1990), demonstrated that a region of
42 nucleotides found in the coding region of Saccharomyces
cerevisiae MATα1 mRNA, which normally confers low
stability, can be experimentally inactivated by
introduction of a translation stop codon immediately
upstream of this 42 nucleotide segment. The experiments
suggest that the decay of MATα1 mRNA is promoted by the
translocation of ribosomes through a specific region of
the coding sequence. This 42 nucleotide segment has a
high content (8 out of 14) of rare codons (where a rare
codon is defined by its occurrence fewer than 13 times per
1000 yeast codons (citing S. Aota et al., Nucl. Acids
Res. 16:r315-r402 (1988))) that may induce slowing of
translation elongation. The authors of the study, R.
Parker and A. Jacobson, state that the concentration of
rare codons in the sequences required for rapid decay,
coupled with the prevalence of rare codons in unstable
yeast mRNAs and the known ability of rare codons to induce
translational pausing, suggests a model in which mRNA
structural changes may be affected by the particular
positioning of a paused ribosome. Another author stated
that it would be revealing to find out whether (and how) a
kinetic change in translation elongation could affect mRNA
stability (M. Hentze, Biochim. Biophys. Acta 1090:281-292
(1991)). R. Parker and A. Jacobson note, however, that
the stable PGK1 mRNA can be altered to include up to 40%
rare codons with, at most, a 3-fold effect on steady-state
mRNA level and that this difference may actually be due to
a change in transcription rates. Thus, these authors
conclude, it seems unlikely that ribosome pausing per se
is sufficient to promote rapid mRNA decay.
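
The rare-codon criterion quoted above (a codon occurring fewer than 13
times per 1000 codons) can be applied to an in-frame segment as in the
following sketch. Only the threshold of 13 comes from the text; the
function name, the toy sequence and the usage values are hypothetical
and do not represent real yeast codon usage data.

```python
RARE_PER_1000 = 13  # threshold quoted in the text above


def rare_codons(seq: str, usage_per_1000: dict) -> tuple:
    """Return (number of rare codons, total codons) for an in-frame sequence.

    usage_per_1000 maps each codon to its frequency per 1000 codons.
    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    rare = sum(usage_per_1000[c] < RARE_PER_1000 for c in codons)
    return rare, len(codons)


# Hypothetical usage values, invented for illustration only.
toy_usage = {"GAA": 45.0, "CGG": 2.0, "AAA": 40.0, "CGA": 3.0}
rare, total = rare_codons("GAACGGAAACGA", toy_usage)
print(f"{rare} of {total} codons are rare")
```

With a real codon usage table in place of the toy dictionary, the same
function would report a count such as the "8 out of 14" cited above.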

None of the aforementioned references describe 
or suggest the present invention of locating 
inhibitory/instability sequences within the coding region 
of an mRNA and modifying the gene encoding that mRNA to
remove these inhibitory/instability sequences by making 
multiple nucleotide substitutions without altering the 
coding capacity of the gene. 

III. DISCLOSURE OF THE INVENTION

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a 
gene by mutating regulatory or inhibitory/instability 
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing 
genes mutated in accordance with these methods and host 
cells containing these constructs. 

As defined herein, an inhibitory/instability
sequence of a transcript is a regulatory sequence that
resides within an mRNA transcript and is either (1)
responsible for rapid turnover of that mRNA and can
destabilize a second indicator/reporter mRNA when fused to
that indicator/reporter mRNA, or (2) responsible for
underutilization of a mRNA and can cause decreased protein
production from a second indicator/reporter mRNA when
fused to that second indicator/reporter mRNA, or (3) both
of the above. The inhibitory/instability sequence of a
gene is the gene sequence that encodes an
inhibitory/instability sequence of a transcript. As used
herein, utilization refers to the overall efficiency of
translation of an mRNA.

The methods of the invention are particularly 
useful for increasing the stability and/or utilization of 
a mRNA without changing its protein coding capacity. 
However, alternative embodiments of the invention in which
the inhibitory/instability sequence is mutated in such a
way that the amino acid sequence of the encoded protein is
changed to include conservative or non-conservative amino
acid substitutions, while still retaining the function of
the originally encoded protein, are also envisioned as
part of the invention.

These methods are useful for allowing or 
increasing the expression of genes which would otherwise 
not be expressed or which would be poorly expressed
because of the presence of INS regions in the mRNA
transcript. The invention provides methods of increasing
the production of a protein encoded by a gene which
encodes an mRNA containing an inhibitory/instability
region by altering the portion of the nucleotide sequence
of any gene encoding the inhibitory/instability region.

The methods, constructs and host cells of the 
invention are useful for increasing the amount of protein 
produced by any gene which encodes an mRNA transcript 
which contains an INS. Examples of such genes include
those coding for growth factors, interferons,
interleukins, and the fos proto-oncogene protein, as well
as the genes coding for HIV-1 gag and env proteins.

The method of the invention is exemplified by
the mutational inactivation of an INS within the coding
region of the HIV-1 gag gene which results in increased
gag expression, and by constructs useful for
Rev-independent gag expression in human cells. This
mutational inactivation of the inhibitory/instability
sequences involves introducing multiple point mutations
into the AU-rich inhibitory sequences within the coding
region of the gag gene which, due to the degeneracy of
nucleotide coding sequences, do not affect the amino acid
sequence of the gag protein.
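
The principle stated above, that multiple point mutations can lower the
AU (AT in the DNA) content of a coding region without changing the
encoded amino acids, can be checked computationally. The following
Python sketch is illustrative only: the function names and the two toy
sequences are hypothetical, the sequences are not taken from HIV-1 gag,
and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; incomplete trailing codons are ignored."""
    dna = dna.upper()
    end = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, end, 3))


def at_content(dna: str) -> float:
    """Fraction of A and T residues in a DNA sequence."""
    dna = dna.upper()
    return sum(base in "AT" for base in dna) / len(dna) if dna else 0.0


def is_silent(original: str, mutated: str) -> bool:
    """True if both sequences have equal length and encode the same protein."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


# Toy sequences invented for illustration; NOT the gag sequence.
original = "ATGAAATTAATTGAA"
mutated = "ATGAAGCTGATCGAG"
print(translate(original), translate(mutated), is_silent(original, mutated))
print(f"AT content: {at_content(original):.2f} -> {at_content(mutated):.2f}")
```

A check of this kind only confirms that the coding capacity is
preserved; whether a given set of substitutions actually inactivates an
INS has to be determined experimentally, as described in this
specification.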

The constructs of the invention are exemplified 
by vectors containing the gag, env, and pol genes which
have been mutated in accordance with the methods of this
invention and the host cells are exemplified by human
HLtat cells containing these vectors.

The invention also relates to using the 
constructs of the invention in immunotherapy and
immunoprophylaxis, e.g., as a vaccine, or in genetic
therapy after expression in humans. Such constructs can
include or be incorporated into retroviral vectors or
other expression vectors or they may also be directly
injected into tissue cells resulting in efficient
expression of the encoded protein or protein fragment.
These constructs may also be used for in-vivo or in-vitro
gene replacement, e.g., by homologous recombination with a
target gene in-situ.

The invention also relates to certain
exemplified constructs which can be used to simply and
rapidly detect and/or further define the boundaries of
inhibitory/instability sequences in any mRNA which is
known or suspected to contain such regions, whether the
INS are within the coding region or in the 3' UTR or both.
Once the INS regions of the genes have been located and/or
further defined through the use of these vectors, the same
vectors can be used in mutagenesis experiments to
eliminate the identified INS without affecting the coding
capacity of the gene, thereby allowing an increase in the
amount of protein produced from expression vectors
containing these mutated genes. The invention also
relates to methods of using these constructs and to host
cells containing these constructs.

The constructs of the invention which can be
used to detect instability/inhibitory regions within an
mRNA are exemplified by the vectors p19, p17M1234,
p37M1234 and p37M1-10D, which are set forth in Fig. 1 (B)
and Fig. 6. p37M1234 and p37M1-10D are the preferred
constructs, due to the existence of a commercially
available ELISA test which allows the simple and rapid
detection of any changes in the amount of expression of
the gag indicator/reporter protein. However, any
constructs which contain the elements depicted between the
long terminal repeats in the aforementioned constructs of
Fig. 1 (B) and Fig. 6, and which can be used to detect
instability/inhibitory regions within a mRNA, are also
envisioned as part of this invention.

The existence of inhibitory/instability 
sequences has been known in the art, but no solution to 
the problem which allowed increased expression of the 
genes encoding the mRNAs containing these sequences within 
coding regions by making multiple nucleotide 
substitutions, without altering the coding capacity of the 
gene, has heretofore been disclosed. 

IV. BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1. (A) Structure of the HIV-1 genome. Boxes indicate
the different viral genes. (B) Structure of the gag
expression plasmids (see infra). Plasmid p17 contains the
complete HIV-1 5' LTR and sequences up to the BssHII
restriction site at nucleotide (nt) 257. (The nucleotide
numbering refers to the revised nucleotide sequence of the
HIV-1 molecular clone pHXB2 (G. Myers et al., Eds., Human
retroviruses and AIDS. A compilation and analysis of
nucleic acid and amino acid sequences (Los Alamos National
Laboratory, Los Alamos, New Mexico, 1991), incorporated
herein by reference).) This sequence is followed by the
p17gag coding sequence spanning nt 336-731 (represented as
an open box) immediately followed by a translational stop
codon and a linker sequence. Adjacent to the linker is
the HIV-1 3' LTR from nt 8561 to the last nucleotide of
the U5 region. Plasmid p17R contains in addition the 330
nt StyI fragment encompassing the RRE (L. Solomin et al.,
J. Virol. 64:6010-6017 (1990)) (represented as a stippled
box) 3' to the p17gag coding sequence. The RRE is followed
by HIV-1 sequences from nt 8021 to the last nucleotide of
the U5 region of the 3' LTR. Plasmids p19 and p19R were
generated by replacing the HIV-1 p17gag coding sequence in
plasmids p17 and p17R, respectively, with the RSV p19gag
coding sequence (represented as a black box). Plasmid
p17M1234 is identical to p17, except for the presence of
28 silent nucleotide substitutions within the gag coding
region, indicated by XXX. Wavy lines represent plasmid
sequences. Plasmid p17M1234(731-1424) and plasmid
p37M1234 are described immediately below and in the
description. These vectors are illustrative of constructs
which can be used to determine whether a particular
nucleotide sequence encodes an INS. In this instance,
vector p17M1234, which contains an indicator gene (here,
p17gag), represents the control vector and vectors
p17M1234(731-1424) and p37M1234 represent vectors in which
the nucleotide sequence of interest (here the p24gag coding
region) is inserted into the vector either 3' to the stop
codon of the indicator gene or is fused in frame to the
coding region of the indicator gene, respectively. (C)
Construction of expression vectors for identification of
gag INS and for further mutagenesis. p17M1234 was used as
a vector to insert additional HIV-1 gag sequences
downstream from the coding region of the altered p17gag
gene. Three different fragments indicated by nucleotide
numbers were inserted into vector p17M1234 as described
below. To generate plasmids p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), the indicated
fragments were inserted 3' to the stop codon of the p17gag
coding sequence in p17M1234. In expression assays (data
not shown), p17M1234(731-1081) and p17M1234(731-1424)
expressed high levels of p17gag protein. In contrast,
p17M1234(731-2165) did not express p17gag protein,
indicating the presence of additional INS within the HIV-1
gag coding region. To generate plasmids
p17M1234(731-1081)NS, p37M1234 and p55M1234, the stop
codon at the end of the altered p17gag gene and all linker
sequences in p17M1234 were eliminated by
oligonucleotide-directed mutagenesis and the resulting
plasmids restored the gag open reading frame as in HIV-1.
In expression assays (data not shown) p37M1234 expressed
high levels of protein as determined by western blotting
and ELISA assays whereas p55M1234 did not express any
detectable gag protein. Thus, the addition of sequences
3' to the p24 region resulted in the elimination of
protein expression, indicating that nucleotide sequence
1424-2165 contains an INS. This experiment demonstrated
that p37M1234 is an appropriate vector to analyze
additional INS.

Fig. 2. Gag expression from the different vectors. (A)
HLtat cells were transfected with plasmid p17, p17R, or
p17M1234 in the absence (-) or presence (+) of Rev (see
infra). The transfected cells were analyzed by
immunoblotting using a human HIV-1 patient serum. (B)
Plasmid p19 or p19R was transfected into HLtat cells in
the absence (-) or presence (+) of Rev. The transfected
cells were analyzed by immunoblotting using rabbit
anti-RSV p19gag serum. HIV or RSV proteins served as
markers in the same gels. The positions of p17gag and
p19gag are indicated at right.

Fig. 3. mRNA analysis on northern blots. (A) HLtat cells
were transfected with the indicated plasmids in the
absence (-) or presence (+) of Rev. 20 μg of total RNA
prepared from the transfected cells were analyzed (see
infra). (B) RNA production from plasmid p19 or p19R was
similarly analyzed in the absence (-) or presence (+) of
Rev.

Fig. 4. Nucleotide sequence of the HIV-1 p17gag region.
The locations of the 4 oligonucleotides (M1-M4) used to
generate all mutants are underlined. The silent
nucleotide substitutions introduced by each mutagenesis
oligonucleotide are indicated below the coding sequence.
Numbering starts from nt +1 of the viral mRNA.

Fig. 5. Gag expression by different mutants. HLtat cells
were transfected with the various plasmids indicated at
the top of the figure. Plasmid p17R was transfected in
the absence (-) or presence (+) of Rev, while the other
plasmids were analyzed in the absence of Rev. p17gag
production was assayed by immunoblotting as described in
Fig. 2.

Fig. 6. Expression vectors used in the identification
and elimination of additional INS elements in the gag
region. The gag and pol region nucleotides included in
each vector are indicated by lines. The position of some
gag and pol oligonucleotides is indicated at the top of
the figure, as are the coding regions for p17gag, p24gag,
p15gag, protease and p66pol proteins. Vector p37M1234 was
further mutagenized using different combinations of
oligonucleotides. One obtained mutant gave high levels of
p24 after expression. It was analyzed by sequencing and
found to contain four mutant oligonucleotides M6gag,
M7gag, M8gag and M10gag. Other mutants containing
different combinations of oligos did not show an increase
in expression, or only partial increase in expression.
p55BM1-10 and p55AM1-10 were derived from p37M1-10D.
p55M1-13P0 contains additional mutations in the gag and
pol regions included in the oligonucleotides M11gag,
M12gag, M13gag and M0pol. The hatched boxes indicate the
location of the mutant oligonucleotides; the hatched boxes
containing circles indicate mutated regions containing
ATTTA sequences, which may contribute to instability
and/or inhibition of the mRNA; and the open boxes
containing triangles indicate mutated regions containing
AATAAA sequences, which may contribute to instability
and/or inhibition of the mRNA. Typical levels of p24gag
expression in human cells after transfections as described
supra are shown at the right (in pg/ml).

Fig. 7. Eukaryotic expression plasmids used to study env
expression. The different expression plasmids are derived
from pNL15E (Schwartz et al., J. Virol. 64:5448-5456
(1990)). The generation of the different constructs is
described in the text. The numbering follows the
corrected HXB2 sequence (Myers et al., 1991, supra; Ratner
et al., Hamatol. Bluttransfus. 31:404-406 (1987); Ratner
et al., AIDS Res. Hum. Retroviruses 3:57-69 (1987);
Solomin et al., J. Virol. 64:6010-6017 (1990)), starting
with the first nucleotide of R as +1. 5'SS, 5' splice
site; 3'SS, 3' splice site.

Fig. 8. Env expression is Rev dependent in the absence of
functional splice sites. Plasmids p15ESD- and p15EDSS (C)
were transfected in the absence or presence of a rev
expression plasmid (pL3crev) into HLtat cells. One day
later, the cells were harvested for analyses of RNA and
protein. Total RNA was extracted and analyzed on Northern
blots (B). The blots were hybridized with a
nick-translated probe spanning XhoI-SacI (nt 8443 to 9118)
of HXB2. Protein production was measured by western blots
to detect cell-associated Env using a mixture of HIV-1
patient sera and rabbit anti-gp120 antibody (A).


Fig. 9. Env production from the gp120 expression plasmids.
The indicated plasmids were transfected into HLtat cells
in duplicate plates. A rev expression plasmid (pL3crev)
was cotransfected as indicated. One day later, the cells
were harvested for analyses of RNA and protein. Total RNA
was extracted and analyzed on Northern blots (A). The
blots were hybridized using a nick-translated probe
spanning nt 6158 to 7924. Protein production (B) was
measured by immunoprecipitation after labeling for 5 h
with 200 mCi/ml of 35S-cysteine to detect secreted
processed Env (gp120).

Fig. 10. The identification of INS elements within gp120
and gp41 using the p19 (RSV gag) test system. Schematic
structure of exon 5E containing the env ORF. Different
fragments (A to G) of the gp41 portion and fragment H of
the vpu/gp120 portion were PCR amplified and inserted into
the unique EcoRI site located downstream of the RSV gag
gene in p19. The location of the sequences included in
the amplified fragments is indicated to the right using
the HXB2R numbering system. Fragments A and B are amplified
from pNL15E and pNL15EDSS (in which the splice acceptor
sites 7A, 7B and 7 have been deleted), respectively, using
the same oligonucleotide primers. They are 276 and 234
nucleotides long, respectively. Fragment C was amplified
from pNL15EDSS as a 323 nucleotide fragment. Fragment F
is a HpaI-KpnI restriction fragment of 362 nucleotides.
Fragment E was amplified as a 668 nucleotide fragment from
pNL15EDSS; therefore the major splice donor at nucleotide
5592 of HXB2 has been deleted. The rest of the fragments
were amplified from pNL15E as indicated in the figure.
HLtat cells were transfected with these constructs. One
day later, the cells were harvested and p19gag production
was determined by Western blot analysis using the
anti-RSV Gag antibody. The expression of Gag from these
plasmids was compared to Gag production of p19. SA, splice
acceptor; B, BamHI; H, HpaI; X, XhoI; K, KpnI. The
down-regulatory effect of INS contained within the
different fragments is indicated at right.

Fig. 11. The identification of INS elements within gp120
and gp41 using the p37M1-10D (mutant INS p37gag expression
system) test system. Schematic structure of the env ORF.
Different fragments (1 to 7) of env were PCR amplified as
indicated in the figure and inserted into the polylinker
located downstream of the p37 mutant gag gene in
p37M1-10D. Fragments 1 to 6 were amplified from the
molecular clone pLW2.4, a gift of Dr. M. Reitz, which is
very similar to HXB2R. Clone pLW2.4 was derived from an
individual infected by the same HIV-1 strain IIIB from
which the HXB2R molecular clone was derived.
Fragment 7 was cloned from pNL43. For consistency and
clarity, the numbering follows the HXB2R system. HLtat
cells were transfected with these constructs. One day
later, the cells were harvested and p24gag production was
determined by antigen capture assay. The expression of
Gag from these plasmids was compared to Gag production of
p37M1-10D. The down-regulatory effect of each fragment is
indicated at right.

Fig. 12. Elimination of the negative effects of CRS in
the pol region. Nucleotides 3700-4194 of HIV-1 were
inserted in vector p37M1234 as indicated. This resulted
in the inhibition of gag expression. Using mutant
oligonucleotides M9pol-M12pol (P9-P12), several mutated
CRS clones were isolated and characterized. One of them,
p37M1234RCRSP10+P12p, contains the mutations indicated in
Fig. 13. This clone produced high levels of gag.
Therefore, the combination of mutations in
p37M1234RCRSP10+P12p eliminated the INS, while mutations
only in the region of P10 or of P12 did not eliminate the
INS.

Fig. 13. Point mutations eliminating the negative effects
of CRS in the pol region (nucleotides 3700-4194). The
combination of mutations able to completely inactivate the
inhibitory/instability element within the CRS region of
HIV-1 pol (nucleotides 3700-4194) is shown under the
sequence in small letters. These mutations are contained
within oligonucleotides M10pol and M12pol (see Table 2).
The M12pol oligonucleotide contains additional mutations
that were not introduced into p37M1234RCRSP10+P12p (see
Fig. 12), as determined by DNA sequencing.

Fig. 14. Plasmid map and nucleotide sequence of the
efficient gag expression vector p37M1-10D. (A) Plasmid
map of vector p37M1-10D. The plasmid contains a
pBluescript KS(-) backbone, human genomic sequences
flanking the HIV-1 sequences as found in the pNL43 genomic
clone, HIV-1 LTRs and the p37gag region (p17 and p24). The
p17 region has been mutagenized using oligonucleotides M1
to M4, and the p24 region has been mutagenized using
oligonucleotides M6, M7, M8 and M10, as described in the
text. The coding region for p37 is flanked by the 5' and
3' HIV-1 LTRs, which provide promoter and polyadenylation
signals, as indicated by the arrows. Three consecutive
arrows indicate the U5, R, and U3 regions of the LTR,
respectively. The transcribed portions of the LTRs are
shown in black. The translational stop codon inserted at
the end of the p24 coding region is indicated at position
1818. Some restriction endonuclease cleavage sites are
also indicated. (B-D) Complete nucleotide sequence of
p37M1-10D. The amino acid sequence of the p37gag protein
is shown under the coding region. Symbols are as above.
Numbering starts at the first nucleotide of the 5' LTR.

V. MODES FOR CARRYING OUT THE INVENTION

It is to be understood that both the foregoing
general description and the following detailed description
are exemplary and explanatory only, and are not
restrictive of the invention, as claimed. The
accompanying drawings, which are incorporated in and
constitute a part of the specification, illustrate an
embodiment of the invention and, together with the
description, serve to explain the principles of the
invention.

The invention comprises methods for eliminating
intragenic inhibitory/instability regions of an mRNA by
(a) identifying the intragenic inhibitory/instability
regions, and (b) mutating the intragenic
inhibitory/instability regions by making multiple point
mutations. These mutations may be clustered. This method
does not require the identification of the exact location
or knowledge of the mechanism of function of the INS.
Nonetheless, the results set forth herein allow the
conclusion that multiple regions within mRNAs participate
in determining stability and utilization and that many of
these elements act at the level of RNA transport,
turnover, and/or localization. Generally, the mutations
are such that the amino acid sequence encoded by the mRNA
is unchanged, although conservative and non-conservative
amino acid substitutions are also envisioned as part of
the invention where the protein encoded by the mutated
gene is substantially similar to the protein encoded by
the non-mutated gene.

The nucleotides to be altered can be chosen
randomly, the only requirement being that the amino acid
sequence of the encoded protein remain unchanged; or, if
conservative and non-conservative amino acid substitutions
are to be made, the only requirement is that the protein
encoded by the mutated gene be substantially similar to
the protein encoded by the non-mutated gene.

If the INS region is AT rich or GC rich, it is
preferable that it be altered so that it has a content of
about 50% G and C and about 50% A and T. If the INS
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an A and T rich region
more G and C rich), more-preferred codons can be altered
to less-preferred codons. If the INS region contains
conserved nucleotides, some of those conserved nucleotides
could be altered to non-conserved nucleotides. Again, the
only requirement is that the amino acid sequence of the
encoded protein remain unchanged; or, if conservative and
non-conservative amino acid substitutions are to be made,
the only requirement is that the protein encoded by the
mutated gene be substantially similar to the protein
encoded by the non-mutated gene.

As used herein, conserved nucleotides means
evolutionarily conserved nucleotides for a given gene,
since this conservation may reflect the fact that they are
part of a signal involved in the inhibitory/instability
determination. Conserved nucleotides can generally be
determined from published references about the gene of
interest or can be determined by using a variety of
computer programs available to practitioners of the art.

Less-preferred and more-preferred codons for
various organisms can be determined from codon usage
charts, such as those set forth in T. Maruyama et al.,
Nucl. Acids Res. 14:r151-r197 (1986) and in S. Aota et
al., Nucl. Acids Res. 16:r315-r402 (1988), or through use
of a computer program, such as that disclosed in U.S.
Patent No. 5,082,767 entitled "Codon Pair Utilization",
issued to G. W. Hatfield et al. on January 21, 1992, which
is incorporated herein by reference.

Generally, the method of the invention is 
carried out as follows: 

1. Identification of an mRNA containing an INS

The rate at which a particular protein is made
is usually proportional to the cytoplasmic level of the
mRNA which encodes it. Thus, a candidate for an mRNA
containing an inhibitory/instability sequence is one whose
mRNA or protein is either not detectably expressed or is
expressed poorly as compared to the level of expression of
a reference mRNA or protein under the control of the same
or similar strength promoter. Differences in the steady
state levels of a particular mRNA (as determined, for
example, by Northern blotting), when compared to the
steady state level of mRNA from another gene under the
control of the same or similar strength promoter, which
cannot be accounted for by changes in the apparent rate of
transcription (as determined, for example, by nuclear
run-on assays) indicate that the gene is a candidate for
an unstable mRNA. In addition or as an alternative to
being unstable, cytoplasmic mRNAs may be poorly utilized
due to various inhibitory mechanisms acting in the
cytoplasm. These effects may be mediated by specific mRNA
sequences which are named herein as "inhibitory
sequences".

Candidate mRNAs containing
inhibitory/instability regions include mRNAs from genes
whose expression is tightly regulated, e.g., many
oncogenes, growth factor genes and genes for biological
response modifiers such as interleukins. The mRNAs of
many of these genes are expressed at very low levels,
decay rapidly and are modulated quickly and transiently
under different conditions. The negative regulation of
expression at the level of mRNA stability and utilization
has been documented in several cases and has been proposed
to occur in many other cases. Examples of genes for
which there is evidence for post-transcriptional
regulation due to the presence of inhibitory/instability
regions in the mRNA include the cellular genes encoding
granulocyte-monocyte colony stimulating factor (GM-CSF);
the proto-oncogenes c-myc, c-myb, c-sis and c-fos;
interferons (alpha, beta and gamma IFNs); interleukins
(IL1, IL2 and IL3); tumor necrosis factor (TNF);
lymphotoxin (Lym); IgG1 induction factor (IgG IF);
granulocyte colony stimulating factor (G-CSF);
transferrin receptor (TfR); and
galactosyltransferase-associated protein (GTA); the HIV-1
genes encoding env, gag and pol; the E. coli genes for
6-phosphogluconate dehydrogenase (gnd) and btuB; and the
yeast gene for MATα1 (see the discussion in the
"Background Art" section, above). The genes encoding the
cellular proto-oncogenes c-myc and c-fos, as well as the
yeast gene for MATα1 and the HIV-1 genes for gag, env and
pol, are genes for which there is evidence for
inhibitory/instability regions within the coding region in
addition to evidence for inhibitory/instability regions
within the non-coding region. Genes encoding or suspected
of encoding mRNAs containing inhibitory/instability
regions within the coding region are particularly relevant
to the invention.

After identifying a candidate unstable or poorly
utilized mRNA, the in vivo half-life (or stability) of
that mRNA can be studied by conducting pulse-chase
experiments (i.e., labeling newly synthesized RNAs with a
radioactive precursor and monitoring the decay of the
radiolabeled mRNA in the absence of label); or by
introducing in vitro transcribed mRNA into target cells
(either by microinjection, calcium phosphate
co-precipitation, electroporation, or other methods known
in the art) to monitor the in vivo half-life of the defined
mRNA population; or by expressing the mRNA under study
from a promoter which can be induced and which shuts off
transcription soon after induction, and estimating the
half-life of the mRNA which was synthesized during this
short transcriptional burst; or by blocking transcription
pharmacologically (e.g., with Actinomycin D) and following
the decay of the particular mRNA at various time points
after the addition of the drug by Northern blotting or RNA
protection (e.g., S1 nuclease) assays. Methods for all the
above determinations are well established. See, e.g.,
M.W. Hentze et al., Biochim. Biophys. Acta 1090:281-292
(1991) and references cited therein. See also
S. Schwartz et al., J. Virol. 66:150-159 (1992). The most
useful measurement is how much protein is produced,
because this includes all possible INS mechanisms.
Examples of various mRNAs which have been shown to contain
or which are suspected to contain INS regions are
described above. Some of these mRNAs have been shown to
have half-lives of less than 30 minutes when their mRNA
levels are measured by Northern blots (see, e.g., D.
Wreschner and G. Rechavi, Eur. J. Biochem. 172:333-340
(1988)).
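
The transcription-block measurement described above can be
sketched numerically. The following is a minimal
illustration only, assuming simple first-order decay (an
assumption of this sketch, not a statement from the cited
references); the function name and the band intensities
are hypothetical.

```python
import math

def estimate_half_life(times_min, levels):
    """Fit ln(level) = ln(level_0) - k * t by least squares and
    return the half-life ln(2) / k, in the units of times_min.
    Assumes first-order decay and strictly positive levels."""
    if len(times_min) != len(levels) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(v) for v in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope

# Hypothetical Northern blot band intensities measured at
# 0, 15, 30 and 60 min after addition of the drug.
half_life = estimate_half_life([0, 15, 30, 60],
                               [100.0, 50.0, 25.0, 6.25])
```

A half-life well below that of a reference mRNA expressed
from a comparable promoter would mark the transcript as a
candidate for further analysis.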

2. Localization of Instability Determinants

When an unstable or poorly utilized mRNA has
been identified, the next step is to search for the
responsible (cis-acting) RNA sequence elements. Detailed
methods for localizing the cis-acting
inhibitory/instability regions are set forth in each of
the references described in the "Background Art" section,
above, and are also discussed infra. The exemplified
constructs of the present invention can also be used to
localize INS (see below). Cis-acting sequences
responsible for specific mRNA turnover can be identified
by deletion and point mutagenesis as well as by the
occasional identification of naturally occurring mutants
with an altered mRNA stability.

In short, to evaluate whether putative
regulatory sequences are sufficient to confer mRNA
stability control, DNA sequences coding for the suspected
INS regions are fused to an indicator (or reporter) gene
to create a gene coding for a hybrid mRNA. The DNA
sequences fused to the indicator (or reporter) gene can be
cDNA, genomic DNA or synthesized DNA. Examples of
indicator (or reporter) genes that are described in the
references set forth in the "Background Art" section
include the genes for neomycin resistance,
β-galactosidase, chloramphenicol acetyltransferase (CAT),
and luciferase, as well as the genes for β-globin, PGK1
and ACT1. See also Sambrook et al., Molecular Cloning. A
Laboratory Manual. 2d. ed. Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, (1989), pp. 16.56-16.67.
Other genes which can be used as indicator genes are
disclosed herein (i.e., the gag gene of the Rous Sarcoma
Virus (which lacks an inhibitory/instability region) and
the Rev independent HIV-1 gag genes of constructs
p17M1234, p37M1234 and p37M1-10D, which have been mutated
to inactivate the inhibitory/instability region and which
constitute one aspect of the invention). In general,
virtually any gene encoding a mRNA which is stable or
which is expressed at relatively high levels (defined here
as being stable enough or expressed at high enough level
so that any decrease in the level of the mRNA or expressed
protein can be detected by standard methods) can be used
as an indicator or reporter gene, although the constructs
p37M1234 and p37M1-10D, which are exemplified herein, are
preferred for reasons set forth below. Preferred methods
of creating hybrid genes using these constructs and
testing the expression of mRNA and protein from these
constructs are also set forth below.

In general, the stability and/or utilization of
the mRNAs generated by the indicator gene and the hybrid
genes consisting of the indicator gene fused to the
sequences suspected of encoding an INS region are tested
by transfecting the hybrid genes into host cells which are
appropriate for the expression vector used to clone and
express the mRNAs. The resulting levels of mRNA are
determined by standard methods of determining mRNA
stability, e.g., Northern blots, S1 mapping or PCR methods,
and the resulting levels of protein produced are
quantitated by protein measuring assays, such as ELISA,
immunoprecipitation and/or western blots. The
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. Note that if the
ultimate goal is to increase production of the encoded
protein, the identification of the INS is most preferably
carried out in the same host cell as will be used for the
production of the protein.

Examples of some of the host cells that have
been used to detect INS sequences include somatic
mammalian cells, Xenopus oocytes, yeast and E. coli. See,
e.g., G. Shaw and R. Kamen, Cell 46:659-667 (1986)
(discussed supra), which localized instability sequences in
GM-CSF by inserting putative inhibitory sequences into the
3' UTR of the β-globin gene, causing the otherwise stable
β-globin mRNA to become unstable when transfected into
mouse or human cells. See also I. Laird-Offringa et al.,
Nucleic Acids Res. 19:2387-2394 (1991), which localized
inhibitory/instability sequences in c-myc using hybrid
c-myc-neomycin resistance genes introduced into rat
fibroblasts, and M. Lundrigan et al., Proc. Natl. Acad.
Sci. USA 88:1479-1483 (1991), which localized
inhibitory/instability sequences in the btuB gene by using
hybrid btuB-lacZ genes introduced into E. coli. For
examples of reported localization of specific
inhibitory/instability sequences within a transcript of
HIV-1 by destabilization of an otherwise long-lived
indicator transcript, see, e.g., M. Emerman, Cell
57:1155-1165 (1989) (replaced 3' UTR of env gene with part
of HBV and introduced into COS-1 cells); S. Schwartz et
al., J. Virol. 66:150-159 (1992) (gag gene fusions with Rev
independent tat reporter gene introduced into HeLa cells);
F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol gene fusions with Rev independent tat reporter
gene or chloramphenicol acetyltransferase (CAT) gene
introduced into HeLa and SW480 cells); and A. Cochrane et
al., J. Virol. 65:5303-5313 (1991) (pol gene fusions with
CAT gene or rat proinsulin gene introduced into COS-1 and
CHO cells).

It is anticipated that in vitro mRNA degradation
systems (e.g., crude cytoplasmic extracts) to assay mRNA
turnover in vitro will complement ongoing in vivo analyses
and help to circumvent some of the limitations of the in
vivo systems. See M.W. Hentze et al., Biochim. Biophys.
Acta 1090:281-292 (1991) and references cited therein.
See also D. Wreschner and G. Rechavi, Eur. J. Biochem.
172:333-340 (1988), which analyzed exogenous mRNA
stability in a reticulocyte lysate cell-free system.

In the method of the invention, the whole gene
of interest may be fused to an indicator or reporter gene
and tested for its effect on the resulting hybrid mRNA in
order to determine whether that gene contains an
inhibitory/instability region or regions. To further
localize the INS within the gene of interest, fragments of
the gene of interest may be prepared by sequentially
deleting sequences from the gene of interest from either
the 5' or 3' ends or both. The gene of interest may also
be separated into overlapping fragments by methods known
in the art (e.g., with restriction endonucleases, etc.).
See, e.g., S. Schwartz et al., J. Virol. 66:150-159
(1992). Preferably, the gene is separated into
overlapping fragments about 300 to 2000 nucleotides in
length. Two types of vector constructs can be made. To
permit the detection of inhibitory/instability regions
that do not need to be translated in order to function,
vectors can be constructed in which the gene of interest
(or its fragments or suspected INS) is inserted into
the 3' UTR downstream from the stop codon of an indicator
or reporter gene. This does not permit translation
through the INS. To test the possibility that some
inhibitory/instability sequences may act only after
translation of the mRNA, vectors can be constructed in
which the gene of interest (or its fragments or suspected
INS) is inserted into the coding region of the
indicator/reporter gene. This method will permit the
detection of inhibitory/instability regions that do need
to be translated in order to function. The hybrid
constructs are transfected into host cells, and the
resulting mRNA levels are determined by standard methods
of determining mRNA stability, e.g., Northern blots, S1
mapping or PCR methods, as set forth above and as
described in most of the references cited in the
"Background Art" section. See also Sambrook et al.
(1989), supra, for experimental methods. The protein
produced from such genes is also easily quantitated by
existing assays, such as ELISAs, immunoprecipitation and
western blots, which are also described in most of the
references cited in the "Background Art" section. See
also Sambrook et al. (1989), supra, for experimental
methods. The hybrid DNAs containing the
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. The use of
various fragments of the gene permits the identification
of multiple independently functional
inhibitory/instability regions, if any, while the use of
overlapping fragments lessens the possibility that an
inhibitory/instability region will not be identified as a
result of its being cut in half, for example.
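
The division of a gene into overlapping fragments can be
illustrated with a short sketch. The function name, the
default fragment length and the overlap below are
assumptions chosen for illustration; the text only
specifies fragments of about 300 to 2000 nucleotides.

```python
def overlapping_fragments(gene_length, fragment_length=600, overlap=200):
    """Return (start, end) coordinates (0-based, end exclusive) of
    fragments that tile a gene, each sharing `overlap` nucleotides
    with the next, so that an element lying across one fragment
    boundary is still intact in the neighboring fragment."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    if not 0 <= overlap < fragment_length:
        raise ValueError("overlap must be smaller than fragment_length")
    step = fragment_length - overlap
    fragments = []
    start = 0
    while True:
        end = min(start + fragment_length, gene_length)
        fragments.append((start, end))
        if end == gene_length:
            break
        start += step
    return fragments

# Hypothetical 1500-nucleotide gene.
tiles = overlapping_fragments(1500)
```

In this sketch the final fragment may be shorter than the
others; in practice it could be extended upstream to stay
within the preferred size range.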

The exemplified test vectors set forth in Fig.
1(B) and Fig. 6 and described herein, e.g., vectors
p17M1234, p37M1234, p37M1-10D and p19, can be used to
assay for the presence and location of INS in various
RNAs, including INS which are located within coding
regions. These vectors can also be used to determine
whether a gene of interest not yet characterized has INS
which are candidates for mutagenesis curing. These
vectors have a particular advantage over the prior art in
that the same vectors can be used in the mutagenesis step
of the invention (described below) in which the identified
INS is eliminated without affecting the coding capacity of
the gene.

The method of using these vectors involves
introducing the entire gene, entire cDNA or fragments of
the gene ranging from approximately 300 nucleotides to
approximately 2 kilobases 3' to the coding region for gag
protein using unique restriction sites which are
engineered into the vectors. The expression of the gag
gene in HLtat cells is measured at both the RNA and
protein levels, and compared to the expression of the
starting vectors. A decrease in expression indicates the
presence of INS candidates that may be cured by
mutagenesis. The method of using the vectors exemplified
in Fig. 1 herein involves introducing the entire gene and
fragments of the gene of interest into vectors p17M1234,
p37M1234 and p19. The fragments are preferably 300-2000
nucleotides long. Plasmid DNA is prepared in E. coli and
purified by the CsCl method.

To permit detection of inhibitory/instability
regions which do not need to be translated in order to
function, the entire gene and fragments of the gene of
interest are introduced into vectors p17M1234, p37M1234 or
p19 3' of the stop codon of the p17gag coding region. To
allow the detection of inhibitory/instability regions that
affect expression only when translated, the described
vectors can be manipulated so that the coding region of
the entire gene or fragments of the gene of interest are
fused in frame to the expressed gag protein gene. For
example, a fragment containing all or part of the coding
region of the gene of interest can be inserted exactly 3'
to the termination codon of the gag coding sequence in
vector p37M1234 and the termination codon of gag and the
linker sequences can be removed by oligonucleotide
mutagenesis in such a way as to fuse the gag reading frame
to the reading frame of the gene of interest.

RNA and protein production from the two
expression vectors (e.g., p37M1234 containing the fragment
of the gene of interest inserted directly 3' of the stop
codon of the gag coding region, with the gag termination
codon intact, and p37M1234 containing the fragment of the
gene of interest inserted in frame with the gag coding
region, with the gag termination codon deleted) are then
compared after transfection of purified DNA into HLtat
cells.

The expression of these vectors after
transfection into human cells is monitored at both the
level of RNA and protein production. RNA levels are
quantitated by, e.g., Northern blots, S1 mapping or PCR
methods. Protein levels are quantitated by, e.g., western
blot or ELISA methods. p37M1234 and p37M1-10D are ideal
for quantitative analysis because a fast non-radioactive
ELISA protocol can be used to detect gag protein (DUPONT
or COULTER gag antigen capture assay). A decrease in the
level of expression of the gag antigen indicates the
presence of inhibitory/instability regions within the
cloned gene or fragment of the gene of interest.
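
The comparison of hybrid constructs against the parental
vector can be sketched as a simple ratio calculation. This
is an illustrative sketch only: the function names, the
construct labels, the readings and the 0.5 cutoff are all
hypothetical, and the text does not prescribe any numerical
threshold.

```python
def relative_expression(hybrid_signal, control_signal):
    """Expression of a hybrid construct as a fraction of the
    parental indicator vector measured in the same experiment."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return hybrid_signal / control_signal

def flag_ins_candidates(measurements, control_signal, threshold=0.5):
    """Return the names of constructs whose reporter level falls
    below `threshold` times the control (hypothetical cutoff)."""
    return [name for name, signal in measurements.items()
            if relative_expression(signal, control_signal) < threshold]

# Hypothetical antigen capture readings (arbitrary units).
control = 1000.0
readings = {"fragment_1": 950.0, "fragment_2": 120.0, "fragment_3": 40.0}
candidates = flag_ins_candidates(readings, control)
```

Fragments flagged in this way would then be carried forward
to the mutagenesis step; replicate transfections would be
needed before drawing any conclusion from a single reading.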

After the inhibitory/instability regions have
been identified, the vectors containing the appropriate
INS fragments can be used to prepare single-stranded DNA
and then used in mutagenesis experiments with specific
chemically synthesized oligonucleotides in the clustered
mutagenesis protocol described below.

3. Mutation of the Inhibitory/Instability
Regions to Generate Stable mRNAs

Once the inhibitory/instability sequences are
located within the coding region of an mRNA, the gene is
modified to remove these inhibitory/instability sequences
without altering the coding capacity of the gene.
Alternatively, the gene is modified to remove the
inhibitory/instability sequences, simultaneously altering
the coding capacity of the gene to encode either
conservative or non-conservative amino acid substitutions.

In the method of the invention, the most general
method of eliminating the INS in the coding region of the
gene of interest is by making multiple mutations in the
INS region of the gene or gene fragments, without changing
the amino acid sequence of the protein encoded by the
gene; or, if conservative and non-conservative amino acid
substitutions are to be made, the only requirement is that
the protein encoded by the mutated gene be substantially
similar to the protein encoded by the non-mutated gene.
It is preferred that all of the suspected
inhibitory/instability regions, if more than one, be
mutated at once. Later, if desired, each
inhibitory/instability region can be mutated separately in
order to determine the smallest region of the gene that
needs to be mutated in order to generate a stable mRNA.
The ability to mutagenize long DNA regions at the same
time can decrease the time and effort needed to produce
the desired stable and/or highly expressed mRNA and
resulting protein. The altered gene or gene fragments
containing these mutations will then be tested in the
usual manner, as described above, e.g., by fusing the
altered gene or gene fragment with a reporter or indicator
gene and analyzing the level of mRNA and protein produced
by the altered genes after transfection into an
appropriate host cell. If the level of mRNA and protein
produced by the hybrid gene containing the altered gene or
gene fragment is about the same as that produced by the
control construct encoding only the indicator gene, then
the inhibitory/instability regions have been effectively
eliminated from the gene or gene fragment due to the
alterations made in the INS.

In the method of the invention, more than two
point mutations will be made in the INS region.
Optionally, point mutations may be made in at least about
10% of the nucleotides in the inhibitory/instability
region. These point mutations may also be clustered. The
nucleotides to be altered can be chosen randomly (i.e.,
not chosen because of AT or GC content or the presence or
absence of rare or preferred codons), the only requirement
being that the amino acid sequence of the encoded protein
remain unchanged; or, if conservative and non-conservative
amino acid substitutions are to be made, the only
requirement is that the protein encoded by the mutated
gene be substantially similar to the protein encoded by
the non-mutated gene.
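
A proposed set of point mutations can be checked against
the two criteria above (number or fraction of altered
nucleotides, and unchanged coding capacity) with a short
sketch. The function names and the example sequences are
hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a
               for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame DNA coding sequence ('*' marks stop)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna), 3))

def point_mutations(original, mutated):
    """Return the 0-based positions at which the sequences differ."""
    if len(original) != len(mutated):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b)
            in enumerate(zip(original.upper(), mutated.upper()))
            if a != b]

# Hypothetical three-codon region (Arg-Ala-Leu) and a recoded version.
original = "AGAGCGTTA"
mutated = "CGCGCCCTG"
positions = point_mutations(original, mutated)
fraction = len(positions) / len(original)
silent = translate(original) == translate(mutated)
```

Here `fraction` can be compared with the optional figure of
about 10% given above, and `silent` confirms that the
encoded amino acid sequence is unchanged.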

In the method of the present invention, the gene
sequence can be mutated so that the encoded protein
remains the same due to the fact that the genetic code is
degenerate, i.e., many of the amino acids may be encoded
by more than one codon. The base code for serine, for
example, is six-way degenerate such that the codons TCT,
TCG, TCC, TCA, AGT, and AGC all code for serine.
Similarly, threonine is encoded by any one of codons ACT,
ACA, ACC and ACG. Thus, a plurality of different DNA
sequences can be used to code for a particular set of
amino acids. The codons encoding the other amino acids
are TTT and TTC for phenylalanine; TTA, TTG, CTT, CTC, CTA
and CTG for leucine; ATT, ATC and ATA for isoleucine; ATG
for methionine; GTT, GTC, GTA and GTG for valine; CCT, CCC,
CCA and CCG for proline; GCT, GCC, GCA and GCG for
alanine; TAT and TAC for tyrosine; CAT and CAC for
histidine; CAA and CAG for glutamine; AAT and AAC for
asparagine; AAA and AAG for lysine; GAT and GAC for
aspartic acid; GAA and GAG for glutamic acid; TGT and TGC
for cysteine; TGG for tryptophan; CGT, CGC, CGA, CGG, AGA
and AGG for arginine; and GGT, GGC, GGA and GGG for
glycine. Charts depicting the codons (i.e., the genetic
code) can be found in various general biology or
biochemistry textbooks.
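
The degeneracy of the genetic code can be tabulated
programmatically. The sketch below builds the standard
genetic code and groups the codons by amino acid
(one-letter symbols, "*" for stop); the variable names are
illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a
               for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons under the amino acid they encode.
SYNONYMS = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS[amino_acid].append(codon)

# Number of codons available for each amino acid.
DEGENERACY = {aa: len(codons) for aa, codons in SYNONYMS.items()}
```

Amino acids with a degeneracy of one (methionine and
tryptophan) offer no room for silent changes, whereas
six-fold degenerate amino acids such as serine and leucine
offer the most.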

In the method of the present invention, if the
portion(s) of the gene encoding the inhibitory/instability
regions are AT-rich, it is preferred, but not believed to
be necessary, that most or all of the mutations in the
inhibitory/instability region be the replacement of A and
T with G and C nucleotides, making the regions more
GC-rich, while still maintaining the coding capacity of the
gene. If the portion(s) of the gene encoding the
inhibitory/instability regions are GC-rich, it is
preferred, but not believed to be necessary, that most or
all of the mutations in the inhibitory/instability region
be the replacement of G and C nucleotides with A and T
nucleotides, making the regions less GC-rich, while still
maintaining the coding capacity of the gene. If the INS
region is either AT-rich or GC-rich, it is most preferred
that it be altered so that it has a content of about 50% G
and C and about 50% A and T. The AT- (or AU-) content
(or, alternatively, the GC-content) of an
inhibitory/instability region or regions can be calculated
by using a computer program designed to make such
calculations. Examples of such programs, used to
determine the AT-richness of the HIV-1 gag
inhibitory/instability regions exemplified herein, are the
GCG Analysis Package for the VAX (University of Wisconsin)
and the Gene Works Package (Intelligenetics).
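
The composition calculation itself is simple enough to sketch directly. The following Python example is an illustration only and is not the GCG or Gene Works software named above; the function names are assumptions. It is applied to the mutated oligonucleotide M1 (SEQ ID NO: 6) listed later in this description.

```python
def base_composition(sequence):
    """Count A/T(U) and G/C nucleotides in a DNA or RNA sequence."""
    seq = sequence.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    return at, gc


def gc_fraction(sequence):
    """Return the fraction of G and C nucleotides (0.0 to 1.0)."""
    at, gc = base_composition(sequence)
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, T, U, G or C nucleotides")
    return gc / total


# Mutated oligonucleotide M1 (SEQ ID NO: 6) as listed in this description.
M1 = "ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg"

at, gc = base_composition(M1)
print(f"M1: {at} AT / {gc} GC, GC fraction = {gc_fraction(M1):.2f}")
```

For M1 this gives 25 A/T and 26 G/C nucleotides, which agrees with the post-mutation ratio stated for M1 in this description and is close to the preferred content of about 50% G and C.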

In the method of the invention, if the INS
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an AT-rich region more
GC-rich), more-preferred codons can be altered to
less-preferred codons. It is also preferred, but not believed
to be necessary, that less-preferred or rarely used codons
be replaced with more-preferred codons. Optionally, only
the most rarely used codons (identified from published
codon usage tables, such as in T. Maruyama et al., Nucl.
Acids Res. 14 (Supp.):r151-r197 (1986)) can be replaced with
preferred codons, or alternatively, most or all of the
rare codons can be replaced with preferred codons.
Generally, the choice of preferred codons to use will
depend on the codon usage of the host cell in which the
altered gene is to be expressed. Note, however, that the
substitution of more-preferred codons with less-preferred
codons is also functional, as shown in the example below.

As noted above, coding sequences are chosen on
the basis of the genetic code and, preferably, on the
preferred codon usage in the host cell or organism in
which the mutated gene of this invention is to be
expressed. In a number of cases the preferred codon usage
of a particular host or expression system can be
ascertained from available references (see, e.g., T.
Maruyama et al., Nucl. Acids Res. 14 (Supp.):r151-r197
(1986)), or can be ascertained by other methods (see,
e.g., U.S. Patent No. 5,082,767 entitled "Codon Pair
Utilization", issued to G. W. Hatfield et al. on January
21, 1992, which is incorporated herein by reference).
Preferably, sequences will be chosen to optimize
transcription and translation as well as mRNA stability so
as to ultimately increase the amount of protein produced.
Selection of codons is thus, for example, guided by the
preferred use of codons by the host cell and/or the need
to provide for desired restriction endonuclease sites and
could also be guided by a desire to avoid potential
secondary structure constraints in the encoded mRNA
transcript. Potential secondary structure constraints can
be identified by the use of computer programs such as the
one described in M. Zuker et al., Nucl. Acids Res. 9:133
(1981). More than one coding sequence may be chosen in
situations where the codon preference is unknown or
ambiguous for optimum codon usage in the chosen host cell
or organism. However, any correct set of codons would
encode the desired protein, even if translated with less
than optimum efficiency.
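
As an illustration of codon selection guided by host codon usage, the following Python sketch replaces each codon with the most frequently used synonymous codon. It is a simplified example under stated assumptions: the usage table is deliberately partial and contains only the human values (occurrences per 1000 codons) quoted elsewhere in this description from Maruyama et al. for six amino acids that each have two codons; the names `USAGE`, `PREFERRED` and `recode`, and the input fragment, are hypothetical.

```python
# Human codon usage (occurrences per 1000 codons) for six amino acids,
# restricted to the values quoted in this description from Maruyama et al.
# (1986). A real application would load a complete table for the chosen host.
USAGE = {
    "Lys": {"aaa": 22.0, "aag": 35.8},
    "Tyr": {"tat": 12.4, "tac": 18.4},
    "His": {"cat": 9.8, "cac": 14.3},
    "Glu": {"gaa": 26.8, "gag": 41.6},
    "Asn": {"aat": 16.9, "aac": 23.6},
    "Gln": {"caa": 11.5, "cag": 32.7},
}

# For every codon in the table, look up the most frequent synonymous codon.
PREFERRED = {
    codon: max(codons, key=codons.get)
    for codons in USAGE.values()
    for codon in codons
}


def recode(coding_sequence):
    """Replace each codon with the preferred synonym, if one is known.

    Codons absent from the partial table are left unchanged, so the
    encoded amino acid sequence is never altered.
    """
    seq = coding_sequence.lower()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(PREFERRED.get(codon, codon) for codon in codons)


# Hypothetical fragment encoding Lys-Tyr-Met-His-Glu.
print(recode("aaatatatgcatgaa"))
```

Because only synonymous codons are exchanged, the recoded fragment encodes the same peptide. A fuller implementation would also weigh the other criteria named above, such as restriction sites and mRNA secondary structure, which this sketch does not attempt.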

In the method of the invention, if the INS
region contains conserved nucleotides, it is also
preferred, but not believed to be necessary, that
conserved nucleotide sequences in the
inhibitory/instability region be mutated. Optionally, at
least approximately 75% of the mutations made in the
inhibitory/instability region may involve the mutation of
conserved nucleotides. Conserved nucleotides can be
determined by using a variety of computer programs
available to practitioners of the art.

In the method of the invention, it is also
anticipated that inhibitory/instability sequences can be
mutated such that the encoded amino acids are changed to
contain one or more conservative or non-conservative amino
acids yet still provide for a functionally equivalent
protein. For example, one or more amino acid residues
within the sequence can be substituted by another amino
acid of a similar polarity which acts as a functional
equivalent, resulting in a neutral substitution in the
amino acid sequence. Substitutes for an amino acid within
the sequence may be selected from other members of the
class to which the amino acid belongs. For example, the
nonpolar (hydrophobic) amino acids include alanine,
leucine, isoleucine, valine, proline, phenylalanine,
tryptophan and methionine. The polar neutral amino acids
include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine. The positively charged (basic)
amino acids include arginine, lysine and histidine. The
negatively charged (acidic) amino acids include aspartic
acid and glutamic acid.
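
The class-based choice of substitutes described above can be expressed as a simple lookup. The following Python sketch encodes the four classes exactly as listed in the preceding paragraph; the names `CLASSES`, `amino_acid_class` and `is_same_class` are illustrative assumptions, and membership in the same class is only a first indication that a substitution may be neutral, not a guarantee of functional equivalence.

```python
# Amino acid classes as listed in the description above.
CLASSES = {
    "nonpolar": {"alanine", "leucine", "isoleucine", "valine", "proline",
                 "phenylalanine", "tryptophan", "methionine"},
    "polar neutral": {"glycine", "serine", "threonine", "cysteine",
                      "tyrosine", "asparagine", "glutamine"},
    "basic": {"arginine", "lysine", "histidine"},
    "acidic": {"aspartic acid", "glutamic acid"},
}


def amino_acid_class(name):
    """Return the class name for an amino acid, or raise KeyError."""
    for class_name, members in CLASSES.items():
        if name.lower() in members:
            return class_name
    raise KeyError(f"unknown amino acid: {name}")


def is_same_class(original, substitute):
    """True if both amino acids belong to the same class."""
    return amino_acid_class(original) == amino_acid_class(substitute)


print(is_same_class("leucine", "isoleucine"))    # both nonpolar
print(is_same_class("lysine", "glutamic acid"))  # basic versus acidic
```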

In the exemplified method of the present
invention, all of the regions in the HIV-1 gag gene
suspected to have inhibitory/instability activity were
first mutated at once over a region approximately 270
nucleotides in length using clustered site-directed
mutagenesis with four different oligonucleotides spanning
a region of approximately 300 nucleotides to generate the
construct p17M1234, described infra, which encodes a
stable mRNA.



The four oligonucleotides, which are depicted in
Fig. 4, are
M1: ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg
(SEQ ID NO: 6);
M2: ccttcagacaggatcagaggagcttcgatcactatacaacacagtagc
(SEQ ID NO: 7);
M3: accctctattgtgtgcaccagcggatcgagatcaaggacaccaaggaagc
(SEQ ID NO: 8); and
M4: gagcaaaacaagtccaagaagaaggcccagcaggcagcagctgacacagg
(SEQ ID NO: 9).
These oligonucleotides are 51 (M1), 48 (M2), 50
(M3) and 50 (M4) nucleotides in length. Each
oligonucleotide introduced several point mutations over an
area of 19-22 nucleotides (see infra). The numbers of
nucleotides 5' to the first mutated nucleotide were 14
(M1); 18 (M2); 17 (M3); and 11 (M4); and the numbers of
nucleotides 3' to the last mutated nucleotide were 15
(M1); 8 (M2); 14 (M3); and 17 (M4). The ratios of AT to
GC nucleotides present in each of these regions before
mutation were 33AT/18GC (M1); 30AT/18GC (M2); 29AT/21GC
(M3) and 27AT/23GC (M4). The ratios of AT to GC
nucleotides present in each of these regions after
mutation were 25AT/26GC (M1); 24AT/24GC (M2); 23AT/27GC
(M3) and 22AT/28GC (M4). A total of 26 codons were
changed. The number of times the codon appears in human
genes per 1000 codons (from T. Maruyama et al., Nucl. Acids
Res. 14 (Supp.):r151-r197 (1986)) is listed in parentheses
next to the codon. In the example, 8 codons encoding
lysine (Lys) were changed from aaa (22.0) to aag (35.8);
two codons encoding tyrosine (Tyr) were changed from tat
(12.4) to tac (18.4); two codons encoding leucine (Leu)
were changed from tta (5.9) to cta (6.1); two codons
encoding histidine (His) were changed from cat (9.8) to
cac (14.3); three codons encoding isoleucine (Ile) were
changed from ata (5.1) to atc (24.0); two codons encoding
glutamic acid (Glu) were changed from gaa (26.8) to gag
(41.6); one codon encoding arginine (Arg) was changed from
aga (10.8) to cga (5.2) and one codon encoding arginine
(Arg) was changed from agg (11.4) to cgg (7.7); one codon
encoding asparagine (Asn) was changed from aat (16.9) to
aac (23.6); two codons encoding glutamine (Gln) were
changed from caa (11.5) to cag (32.7); one codon encoding
serine (Ser) was changed from agt (8.7) to tcc (18.7); and
one codon encoding alanine (Ala) was changed from gca
(12.7) to gcc (29.8).
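
The codon changes listed in the preceding paragraph can be tallied to cross-check the stated totals. The following Python sketch is offered only as a bookkeeping aid; the variable and function names are assumptions, and the codon spellings are those of the paragraph above.

```python
# (amino acid, number of codons changed, old codon, old frequency,
#  new codon, new frequency). Frequencies are occurrences per 1000 codons
# in human genes, as quoted in the paragraph above.
CHANGES = [
    ("Lys", 8, "aaa", 22.0, "aag", 35.8),
    ("Tyr", 2, "tat", 12.4, "tac", 18.4),
    ("Leu", 2, "tta", 5.9, "cta", 6.1),
    ("His", 2, "cat", 9.8, "cac", 14.3),
    ("Ile", 3, "ata", 5.1, "atc", 24.0),
    ("Glu", 2, "gaa", 26.8, "gag", 41.6),
    ("Arg", 1, "aga", 10.8, "cga", 5.2),
    ("Arg", 1, "agg", 11.4, "cgg", 7.7),
    ("Asn", 1, "aat", 16.9, "aac", 23.6),
    ("Gln", 2, "caa", 11.5, "cag", 32.7),
    ("Ser", 1, "agt", 8.7, "tcc", 18.7),
    ("Ala", 1, "gca", 12.7, "gcc", 29.8),
]


def point_mutations(old, new):
    """Number of nucleotide positions at which two codons differ."""
    return sum(a != b for a, b in zip(old, new))


codons_changed = 0
to_more_frequent = 0
to_less_frequent = 0
nucleotide_changes = 0
for _aa, count, old, old_freq, new, new_freq in CHANGES:
    codons_changed += count
    nucleotide_changes += count * point_mutations(old, new)
    if new_freq > old_freq:
        to_more_frequent += count
    else:
        to_less_frequent += count

print("codons changed:", codons_changed)
print("to more frequent codon:", to_more_frequent)
print("to less frequent codon:", to_less_frequent)
print("nucleotide changes:", nucleotide_changes)
```

The tally reproduces the stated total of 26 changed codons. Of these, 24 move to a more frequently used human codon and the two arginine codons move to a less frequently used one. Counting differing nucleotide positions gives 28, which matches the number of point mutations reported for p17M1234 in Example 1.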

The techniques of oligonucleotide-directed
site-specific mutagenesis employed to effect the modifications
in structure or sequence of the DNA molecule are known to
those of skill in the art. The target DNA sequences which
are to be mutagenized can be cDNA, genomic DNA or
synthesized DNA sequences. Generally, these DNA sequences
are cloned into an appropriate vector, e.g., a
bacteriophage M13 vector, and single-stranded template DNA
is prepared from a plaque generated by the recombinant
bacteriophage. The single-stranded DNA is annealed to the
synthetic oligonucleotides and the mutagenesis and
subsequent steps are performed by methods well known in
the art. See, e.g., M. Smith and S. Gillam, in Genetic
Engineering: Principles and Methods, Plenum Press 3:1-32
(1981) (review) and T. Kunkel, Proc. Natl. Acad. Sci. USA
82:488-492 (1985). See also Sambrook et al. (1989),
supra. The synthetic oligonucleotides can be synthesized
on a DNA synthesizer (e.g., Applied Biosystems) and
purified by electrophoresis by methods known in the art.
The length of the selected or prepared
oligodeoxynucleotides using this method can vary. There
are no absolute size limits. As a matter of convenience,
for use in the process of this invention, the shortest
length of the oligodeoxynucleotide is generally
approximately 20 nucleotides and the longest length is
generally approximately 60 to 100 nucleotides. The size
of the oligonucleotide primers is determined by the
requirement for stable hybridization of the primers to the
regions of the gene in which the mutations are to be
induced, and by the limitations of the currently available
methods for synthesizing oligonucleotides. The factors to
be considered in designing oligonucleotides for use in
oligonucleotide-directed mutagenesis (e.g., overall size,
size of portions flanking the mutation(s)) are described
by M. Smith and S. Gillam in Genetic Engineering:
Principles and Methods, Plenum Press 3:1-32 (1981). In
general, the overall length of the oligonucleotide will be
such as to optimize stable, unique hybridization at the
mutation site with the 5' and 3' extensions from the
mutation site being of sufficient size to avoid editing of
the mutation(s) by the exonuclease activity of the DNA
polymerase. Oligonucleotides used for mutagenesis in the
present invention will generally be at least about 20
nucleotides, usually about 40 to 60 nucleotides in length
and usually will not exceed about 100 nucleotides in
length. The oligonucleotides will usually contain at
least about five bases 3' of the altered codons.
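
These design guidelines can be checked against the four exemplified oligonucleotides using only the lengths and flank sizes stated earlier in this description. The Python sketch below is an illustration under stated assumptions: the names are hypothetical, and the number of bases 3' of the altered codons is approximated conservatively as the number of nucleotides 3' of the last mutated nucleotide minus two, since an altered codon can extend at most two nucleotides beyond its last mutated position.

```python
# Length, and number of unmutated nucleotides 5' and 3' of the mutated
# window, for the four exemplified oligonucleotides (values from the text).
OLIGOS = {
    "M1": {"length": 51, "flank_5": 14, "flank_3": 15},
    "M2": {"length": 48, "flank_5": 18, "flank_3": 8},
    "M3": {"length": 50, "flank_5": 17, "flank_3": 14},
    "M4": {"length": 50, "flank_5": 11, "flank_3": 17},
}

MIN_LENGTH = 20   # "at least about 20 nucleotides"
MAX_LENGTH = 100  # "usually will not exceed about 100 nucleotides"
MIN_BASES_3 = 5   # "at least about five bases 3' of the altered codons"


def mutated_window(oligo):
    """Size of the region between the first and last mutated nucleotide."""
    return oligo["length"] - oligo["flank_5"] - oligo["flank_3"]


def meets_guidelines(oligo):
    """Check overall length and a conservative estimate of the 3' flank."""
    bases_3_of_codon = oligo["flank_3"] - 2  # worst case, see lead-in
    return (MIN_LENGTH <= oligo["length"] <= MAX_LENGTH
            and bases_3_of_codon >= MIN_BASES_3)


for name, oligo in OLIGOS.items():
    print(name, "window:", mutated_window(oligo),
          "within guidelines:", meets_guidelines(oligo))
```

The computed windows of 22, 22, 19 and 22 nucleotides are consistent with the statement that each oligonucleotide introduced mutations over an area of 19-22 nucleotides.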

In the preferred mutagenesis protocol of the
present invention, the INS-containing expression vectors
contain the BLUESCRIPT plasmid vector as a backbone. This
enables the preparation of double-stranded as well as
single-stranded DNA. Single-stranded uracil-containing
DNA is prepared according to a standard protocol as
follows: The plasmid is transformed into an F' bacterial
strain (e.g., DH5αF'). A colony is grown and infected
with the helper phage M13-VCS [Stratagene #20025; 1 x 10^11
pfu/ml]. This phage is used to infect a culture of the E.
coli strain CJ236 and single-stranded DNA is isolated
according to standard methods. 0.25 µg of single-stranded
DNA is annealed with the synthesized oligonucleotides (5
µl of each oligo, dissolved at a concentration of 5
OD260/ml). The synthesized oligonucleotides are usually
about 40 to 60 nucleotides in length and are designed to
contain a perfect match of approximately 10 nucleotides at
each end. They may contain as many changes as desired
within the remaining 20-40 nucleotides. The
oligonucleotides are designed to cover the region of
interest and they may be next to each other or there may
be gaps between them. Up to six different
oligonucleotides have been used at the same time, although
it is believed that the use of more than six
oligonucleotides at the same time would also work in the
method of this invention. After annealing, elongation
with T4 polymerase produces the second strand which does
not contain uracil. The free ends are ligated using
ligase. This results in double-stranded DNA which can be
used to transform E. coli strain HB101. The mutated
strand which does not contain uracil produces
double-stranded DNA, which contains the introduced mutations.
Individual colonies are picked and the mutations are
quickly verified by sequence analysis. Alternatively or
additionally, this mutagenesis method can be (and has been)
used to select for different combinations of
oligonucleotides which result in different mutant
phenotypes. This facilitates the analysis of the regions
important for function and is helpful in subsequent
experiments because it allows the analysis of exact
sequences involved in the INS. In addition to the
exemplified mutagenesis of the INS-1 region of HIV-1
described herein, this method has also been used to mutate
in one step a region of 150 nucleotides using three
tandemly arranged oligonucleotides that introduced a total
of 35 mutations. The upper limit of changes is not clear,
but it is estimated that regions of approximately 500
nucleotides can be changed in 20% of their nucleotides in
one step using this protocol.
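
The mutation densities mentioned above can be put side by side with a short calculation. This Python sketch is purely arithmetic; the function name `mutation_density` is an assumption.

```python
def mutation_density(mutations, region_length):
    """Fraction of nucleotides changed within a region."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return mutations / region_length


# 35 mutations introduced into a 150-nucleotide region in one step.
observed = mutation_density(35, 150)

# Estimated capacity: about 20% of a region of about 500 nucleotides.
estimated_changes = round(0.20 * 500)

print(f"observed density: {observed:.1%}")
print(f"estimated changes in 500 nt at 20%: {estimated_changes}")
```

The 150-nucleotide experiment already changed slightly more than 20% of the positions in the region, so the estimate of about 100 changes over 500 nucleotides extrapolates a density that has been achieved on a smaller scale.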

The exemplified method of mutating by using
oligonucleotide-directed site-specific mutagenesis may be
varied by using other methods known in the art. For
example, the mutated gene can be synthesized directly
using overlapping synthetic deoxynucleotides (see, e.g.,
Edge et al., Nature 292:756 (1981); Nambiar et al.,
Science 223:1299 (1984); Jay et al., J. Biol. Chem.
259:6311 (1984)); or by using a combination of polymerase
chain reaction generated DNAs or cDNAs and synthesized
oligonucleotides.

4. Determination of Stability of the Mutated mRNA

The steady state level and/or stability of the
resultant mutated mRNAs can be tested in the same manner
as the steady state level and/or stability of the
unmodified mRNA containing the inhibitory/instability
regions is tested (e.g., by Northern blotting), as
discussed in section 1, above. The mutated mRNA can be
analyzed along with (and thus compared to) the unmodified
mRNA containing the inhibitory/instability region(s) and
with an unmodified indicator mRNA, if desired. As
exemplified, the HIV-1 p17gag mutants are compared to the
unmutated HIV-1 p17gag in transfection experiments by
subsequent analysis of the mRNAs by Northern blot
analysis. The proteins produced by these mRNAs are
measured by immunoblotting and other methods known in the
art, such as ELISA. See infra.

VI. INDUSTRIAL APPLICABILITY 

Genes which can be mutated by the methods of
this invention include those whose mRNAs are known or
suspected to contain INS regions. These
genes include, for example, those coding for growth
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag, env and pol, as well as other
viral mRNAs in addition to those exemplified herein.
Genes mutated by the methods of this invention can be
expressed in the native host cell or organism or in a
different cell or organism. The mutated genes can be
introduced into a vector such as a plasmid, cosmid, phage,
virus or mini-chromosome and inserted into a host cell or
organism by methods well known in the art. In general,
the mutated genes or constructs containing these mutated
genes can be utilized in any cell, either eukaryotic or
prokaryotic, including mammalian cells (e.g., human (e.g.,
HeLa), monkey (e.g., Cos), rabbit (e.g., rabbit
reticulocytes), rat, hamster (e.g., CHO and baby hamster
kidney cells) or mouse cells (e.g., L cells)), plant cells,
yeast cells, insect cells or bacterial cells (e.g., E.
coli). The vectors which can be utilized to clone and/or
express these mutated genes are the vectors which are
capable of replicating and/or expressing the mutated genes
in the host cell in which the mutated genes are desired to
be replicated and/or expressed. See, e.g., F. Ausubel et
al., Current Protocols in Molecular Biology, Greene
Publishing Associates and Wiley-Interscience (1992) and
Sambrook et al. (1989) for examples of appropriate vectors
for various types of host cells. The native promoters for
such genes can be replaced with strong promoters
compatible with the host into which the gene is inserted.
These promoters may be inducible. The host cells
containing these mutated genes can be used to express
large amounts of the protein useful in enzyme
preparations, pharmaceuticals, diagnostic reagents,
vaccines and therapeutics.

Genes altered by the methods of the invention or
constructs containing said genes may also be used for in
vivo or in vitro gene replacement. For example, a gene
which produces an mRNA with an inhibitory/instability
region can be replaced with a gene that has been modified
by the method of the invention in situ to ultimately
increase the amount of protein expressed. Such genes
include viral genes and/or cellular genes. Such gene
replacement might be useful, for example, in the
development of a vaccine and/or genetic therapy.

The constructs and/or proteins made by using
constructs encoding the exemplified altered gag, env, and
pol genes could be used, for example, in the production of
diagnostic reagents, vaccines and therapies for AIDS and
AIDS-related diseases. The inhibitory/instability
elements in the exemplified HIV-1 gag gene may be involved
in the establishment of a state of low virus production in
the host. HIV-1 and the other lentiviruses cause chronic
active infections that are not cleared by the immune
system. It is possible that complete removal of the
inhibitory/instability sequence elements from the
lentiviral genome would result in constitutive expression.
This could prevent the virus from establishing a latent
infection and escaping immune system surveillance. The
success in increasing expression of p17gag by eliminating
the inhibitory sequence element suggests that one could
produce lentiviruses without any negative elements. Such
lentiviruses could provide a novel approach towards
attenuated vaccines.

For example, vectors expressing high levels of
Gag can be used in immunotherapy and immunoprophylaxis,
after expression in humans. Such vectors include
retroviral vectors, and delivery can also be by direct
injection of DNA into muscle cells or other receptive
cells, resulting in the efficient expression of gag, using
the technology described, for example, in Wolff et al.,
Science 247:1465-1468 (1990), Wolff et al., Human
Molecular Genetics 1(6):363-369 (1992) and Ulmer et al.,
Science 259:1745-1749 (1993). Further, the gag constructs
could be used in transdominant inhibition of HIV
expression after the introduction into humans. For this
application, for example, appropriate vectors or DNA
molecules expressing high levels of p55gag or p37gag would
be modified to generate transdominant gag mutants, as
described, for example, in Trono et al., Cell 59:113-120
(1989). The vectors would be introduced into humans,
resulting in the inhibition of HIV production due to the
combined mechanisms of gag transdominant inhibition and of
immunostimulation by the produced gag protein. In
addition, the gag constructs of the invention could be
used in the generation of new retroviral vectors based on
the expression of lentiviral gag proteins. Lentiviruses
have unique characteristics that may allow the targeting
and efficient infection of non-dividing cells. Similar
applications are expected for vectors expressing high
levels of env.

Identification of similar inhibitory/instability
elements in SIV indicates that this virus may provide a
convenient model to test these hypotheses.

The exemplified constructs can also be used to
simply and rapidly detect and/or further define the
boundaries of inhibitory/instability sequences in any mRNA
which is known or suspected to contain such regions, e.g.,
in mRNAs encoding various growth factors, interferons or
interleukins, as well as other viral mRNAs in addition to
those exemplified herein.

The following examples illustrate certain
embodiments of the present invention, but should not be
construed as limiting its scope in any way. Certain
modifications and variations will be apparent to those
skilled in the art from the teachings of the foregoing
disclosure and the following examples, and these are
intended to be encompassed by the spirit and scope of the
invention.

EXAMPLE 1 
HIV-1 GAG GENE 

The interaction of the Rev regulatory protein of
human immunodeficiency virus type 1 (HIV-1) with its RNA
target, named the Rev-responsive element (RRE), is
necessary for expression of the viral structural proteins
(for reviews see G. Pavlakis and B. Felber, New Biol.
2:20-31 (1990); B. Cullen and W. Greene, Cell 58:423-426
(1989); and C. Rosen and G. Pavlakis, AIDS J. 4:499-509
(1990)). Rev acts by promoting the nuclear export and
increasing the stability of the RRE-containing mRNAs.
Recent results also indicate a role for Rev in the
efficient polysome association of these mRNAs (S. Arrigo
and I. Chen, Genes Dev. 5:808-819 (1991); D. D'Agostino et
al., Mol. Cell. Biol. 12:1375-1386 (1992)). Since the
RRE-containing HIV-1 mRNAs do not efficiently produce protein
in the absence of Rev, it has been postulated that these
mRNAs are defective and contain inhibitory/instability
sequences variously designated as INS, CRS, or IR (M.
Emerman et al., Cell 57:1155-1165 (1989); S. Schwartz et
al., J. Virol. 66:150-159 (1992); C. Rosen et al., Proc.
Natl. Acad. Sci. USA 85:2071-2075 (1988); M.
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 (1989); F.
Maldarelli et al., J. Virol. 65:5732-5743 (1991); A. W.
Cochrane et al., J. Virol. 65:5305-5313 (1991)). The
nature and function of these inhibitory/instability
sequences have not been characterized in detail. It has
been postulated that inefficiently used splice sites may
be necessary for Rev function (D. Chang and P. Sharp, Cell
59:789-795 (1989)); the presence of such splice sites may
confer Rev-dependence to HIV-1 mRNAs.

Analysis of HIV-1 hybrid constructs led to the
initial characterization of some inhibitory/instability
sequences in the gag and pol regions of HIV-1 (S. Schwartz
et al., J. Virol. 66:150-159 (1992); F. Maldarelli et al.,
J. Virol. 65:5732-5743 (1991); A. W. Cochrane et al., J.
Virol. 65:5305-5313 (1991)). The identification of an
inhibitory/instability RNA element located in the coding
region of the p17gag matrix protein of HIV-1 was also
reported (S. Schwartz et al., J. Virol. 66:150-159
(1992)). It was shown that this sequence acted in cis to
inhibit HIV-1 tat expression after insertion into a tat
cDNA. The inhibition could be overcome by Rev-RRE,
demonstrating that this element plays a role in regulation
by Rev.



1. p17gag expression plasmid

To further study the inhibitory/instability
element in p17gag, a p17gag expression plasmid (p17, Fig. 1)
was constructed. The p17gag sequence was engineered to
contain a translational stop codon immediately after the
coding sequence and thus could produce only p17gag (the
construction of this plasmid is described below). The
major 5' splice site of HIV-1 upstream of the gag AUG has
been deleted from this vector (B. Felber et al., Proc.
Natl. Acad. Sci. USA 86:1495-1499 (1989)). To investigate
whether plasmid p17 could produce p17gag in the absence of
Rev and the RRE, p17 was transfected into HLtat cells (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990)) (see
below). These cells constitutively produce HIV-1 Tat
protein, which is necessary for transactivation of the
HIV-1 LTR promoter. Plasmid p17 was transfected in the
absence or presence of Rev, and the production of p17gag
was analyzed by western immunoblotting. The results
revealed that very low levels of p17gag protein were
produced (Fig. 2A). The presence of Rev did not increase
gag expression, as expected, since this mRNA did not
contain the RRE. Next, a plasmid that contained both the
p17gag coding sequence and the RRE (p17R, Fig. 1) was
constructed. Like p17, this plasmid produced very low
levels of p17gag in the absence of Rev. High levels of
p17gag were produced only in the presence of Rev (Fig. 2A).
These experiments suggested that an inhibitory/instability
element was located in the p17gag coding sequence.

Expression experiments using various eucaryotic
vectors have indicated that several other retroviruses do
not contain such inhibitory/instability sequences within
their coding sequences (see, for example, J. Wills et al.,
J. Virol. 63:4331-43 (1989) and V. Morris et al., J.
Virol. 62:349-53 (1988)). To verify these results, the
p17gag (matrix) gene of HIV-1 in plasmid p17 was replaced
with the coding sequence for p19gag (matrix), which is the
homologous protein of the Rous sarcoma virus (RSV, strain
SR-A). The resulting plasmid, p19 (Fig. 1), was identical
to plasmid p17, except for the gag coding sequence. The
production of p19gag protein from plasmid p19 was analyzed
by western immunoblotting, which revealed that this
plasmid produced high levels of p19gag (Fig. 2A). These
experiments demonstrated that the p19gag coding sequence of
RSV, in contrast to p17gag of HIV-1, could be efficiently
expressed in this vector, indicating that the gag region
of RSV did not contain any inhibitory/instability
elements. A derivative of plasmid p19 that contained the
RRE, named p19R (Fig. 1), was also constructed.
Interestingly, only very low levels of p19gag protein were
produced from the RRE-containing plasmid p19R in the
absence of Rev. This observation indicated that the
introduced RRE and 3' HIV-1 sequences exerted an
inhibitory effect on p19gag expression from plasmid p19R,
which is in agreement with recent data indicating that in
the absence of Rev, a longer region at the 3' end of the
virus including the RRE acts as an inhibitory/instability
element (G. Nasioulas, G. Pavlakis, B. Felber, manuscript
in preparation). In summary, the high levels of
expression of RSV p19gag in the same vector reinforced the
conclusion that an inhibitory/instability sequence within
the HIV-1 p17gag coding region was responsible for the very
low levels of expression.

It was next determined whether the
inhibitory/instability effect of the p17gag coding sequence
was detected also at the mRNA level. Northern blot
analysis of RNA extracted from HLtat cells transfected
with p17 or transfected with p17R demonstrated that p17R
produced lower mRNA levels in the absence of Rev (Fig. 3A)
(see Example 3). A two- to eight-fold increase in p17R
mRNA levels was observed after coexpression with Rev.
Plasmid p17 produced mRNA levels similar to those produced
by p17R in the absence of Rev. Notably, Rev decreased the
levels of mRNA and protein produced by mRNAs that do not
contain RRE. This inhibitory effect of Rev in
cotransfection experiments has been observed for many
other non-RRE-containing mRNAs, such as luciferase and CAT
(L. Solomin et al., J. Virol. 64:6010-6017 (1990); D. M.
Benko et al., New Biol. 2:1111-1122 (1990)). These results
established that the inhibitory element in gag also
affects the mRNA levels and are in agreement with previous
findings (S. Schwartz et al., J. Virol. 66:150-159
(1992)). Quantitations of the mRNA and protein levels
produced by p17R in the absence or presence of Rev were
performed by scanning densitometry of appropriate serial
dilutions of the samples, and indicated that the
difference was greater at the level of protein (60- to
100-fold) than at the level of mRNA (2- to 8-fold). This
result is compatible with previous findings of effects of
Rev on mRNA localization and polysomal loading of both gag
and env mRNAs (S. Arrigo et al., Genes Dev. 5:808-819
(1991); D. D'Agostino et al., Mol. Cell. Biol.
12:1375-1386 (1992); M. Emerman et al., Cell 57:1155-1165 (1989);
B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989); M. Malim et al., Nature (London) 338:254-257
(1989)). Northern blot analysis of the mRNAs produced by
the RSV gag expression plasmids revealed that p19 produced
high mRNA levels (Fig. 3B). This further demonstrated
that the p19gag coding sequence of RSV does not contain
inhibitory elements. The presence of the RRE and 3' HIV-1
sequences in plasmid p19R resulted in decreased mRNA
levels in the absence of Rev, further suggesting that
inhibitory elements were present in these sequences.
Taken together, these results established that gag
expression in HIV-1 is fundamentally different from that
in RSV. The HIV-1 p17gag coding sequence contains a strong
inhibitory element while the RSV p19gag coding sequence
does not. Interestingly, plasmid p19 contains the 5'
splice site used to generate the RSV env mRNA, which is
located downstream of the gag AUG. This 5' splice site is
not utilized in the described expression vectors (Fig.
3B). Mutation of the invariable GT dinucleotide of this
5' splice site to AT did not affect p19gag expression
significantly (data not shown). On the other hand, the
HIV-1 p17 expression plasmid did not contain any known
splice sites, yet was not expressed in the absence of Rev.
These results further indicate that sequences other than
inefficiently used splice sites are responsible for
inhibition of gag expression.

2. Mutated p17gag vectors

To investigate the exact nature of the
inhibitory element in HIV-1 gag, site-directed mutagenesis
of the p17gag coding sequence with four different
oligonucleotides, as indicated in Fig. 4, was performed.
Each oligonucleotide introduced several point mutations
over an area of 19-22 nucleotides. These mutations did
not affect the amino acid sequence of the p17gag protein,
since they introduced silent codon changes. First, all
four oligonucleotides were used simultaneously in
mutagenesis using a single-stranded DNA template as
described (T. Kunkel, Proc. Natl. Acad. Sci. USA
82:488-492 (1985); S. Schwartz et al., Mol. Cell. Biol.
12:207-219 (1992)). This allowed the simultaneous introduction
of many point mutations over a large region of 270 nt in
vector p17. A mutant containing all four oligonucleotides
was isolated and named p17M1234. Compared to p17, this
plasmid contained a total of 28 point mutations
distributed primarily in regions with high AU-content.
The phenotype of the mutant was assessed by transfections
into HLtat cells and subsequent analysis of p17gag
expression by immunoblotting. Interestingly, p17M1234
produced high levels of p17gag protein, higher than those
produced by p17R in the presence of Rev (Fig. 2A). This
result demonstrated that the inhibitory/instability
signals in p17gag mRNA had been inactivated in plasmid
p17M1234. As expected, the presence of Rev protein did
not increase expression from p17M1234, but instead had a
slight inhibitory effect on gag expression. Thus, p17gag
expression from the mutant p17M1234 displayed the same
general properties as the p19gag of RSV, that is, a high
constitutive level of Rev-independent gag expression.
Northern blot analysis revealed that the mRNA levels
produced by p17M1234 were increased compared to those
produced by p17 (Fig. 3A).

To further examine the nature and exact location
of the minimal inhibitory/instability element, the p17gag
coding sequence in plasmid p17 was mutated with only one
of the four mutated oligonucleotides at a time. This
procedure resulted in four mutant plasmids, named p17M1,
p17M2, p17M3, and p17M4, according to the oligonucleotide
that each contains. None of these mutants produced
significantly higher levels of p17gag protein compared to
plasmid p17 (Fig. 5), indicating that the
inhibitory/instability element was not affected. The p17
coding sequence was next mutated with two oligonucleotides
at a time. The resulting mutants were named p17M12,
p17M13, p17M14, p17M23, p17M24, and p17M34. Protein
production from these mutants was minimally increased
compared with that from p17, and it was considerably lower
than that from p17M1234 (Fig. 5). In addition, a triple
oligonucleotide mutant, p17M123, also failed to express
high levels of p17gag (data not shown). These findings may
suggest that multiple inhibitory/instability signals are
present in the coding sequence of p17gag. Alternatively, a
single inhibitory/instability element may span a large
region, whose inactivation requires mutagenesis with more
than two oligonucleotides. This possibility is consistent
with previous data suggesting that a 218-nucleotide
inhibitory/instability element in the p17gag coding
sequence is required for strong inhibition of gag
expression. Further deletions of this sequence resulted
in gradual loss of inhibition (S. Schwartz et al., J.
Virol. 66:150-159 (1992)). The inhibitory/instability
element may coincide with a specific secondary structure
on the mRNA. It is currently being investigated whether a
specific structure is important for the function of the
inhibitory/instability element.

The p17gag coding sequence has a high content of
A and U nucleotides, unlike the coding sequence of p19gag
of RSV (S. Schwartz et al., J. Virol. 66:150-159 (1992);
G. Myers and G. Pavlakis, in The Retroviridae, J. Levy,
Eds. (Plenum Press, New York, NY, 1992), pp. 1-37). Four
regions with high AU content are present in the p17gag
coding sequence and have been implicated in the inhibition
of gag expression (S. Schwartz et al., J. Virol.
66:150-159 (1992)). Lentiviruses have a high AU content
compared to the mammalian genome. Regions of high AU content are
found in the gag/pol and env regions, while the multiply
spliced mRNAs have a lower AU content (G. Myers and G.
Pavlakis, in The Retroviridae, J. Levy, Eds. (Plenum
Press, New York, NY, 1992), pp. 1-37), supporting the
possibility that the inhibitory/instability elements are
associated with mRNA regions with high AU content. It has
been shown that a specific oligonucleotide sequence,
AUUUA, found at the AU-rich 3' untranslated regions of
some unstable mRNAs, may confer RNA instability (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). Although this
sequence is not present in the p17gag sequence, it is found
in many copies within gag/pol and env regions. The
association of instability elements with AU-rich regions
is not universal, since the RRE together with 3' HIV
sequences, which shows a strong inhibitory/instability
activity in our vectors, is not AU-rich. These
observations suggest the presence of more than one type of
inhibitory/instability sequence. In addition to reducing
the AU content, some of the mutations introduced in
plasmid p17 changed rarely used codons to more favored
codons for human cells. Although the use of rare codons
could be an alternative explanation for poor HIV gag
expression, this type of translational regulation is not
favored by these results, since the presence of Rev
corrects the defect in gag expression. In addition, the
observation that the presence of non-translated sequences
reduced gag expression (for example, the RRE sequence in
p17R) suggests that translation of the
inhibitory/instability region is not necessary for
inhibition. Introduction of RRE and 3' HIV sequences in
p17M1234 was also able to decrease gag expression,
verifying that independent negative elements not acting
co-translationally are responsible for poor expression.
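
The AU-content and AUUUA-motif observations above lend themselves to a simple computational scan. The sketch below is illustrative only: the function names, window size and test sequences are assumptions introduced here, not parameters taken from the specification.

```python
def au_content(seq: str) -> float:
    """Fraction of A and U (or T) nucleotides in a sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


def au_windows(seq: str, window: int, step: int = 1):
    """Yield (0-based start, AU fraction) for each sliding window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, au_content(seq[start:start + window])


def count_motif(seq: str, motif: str = "AUUUA") -> int:
    """Count (possibly overlapping) occurrences of a motif such as AUUUA."""
    seq = seq.upper().replace("T", "U")
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


# Hypothetical toy sequence, used only to exercise the functions.
toy = "AUUUAUUUAGCGC"
print(au_content(toy), count_motif(toy))
```

Run over a coding sequence with a window of a few tens of nucleotides, a scan of this kind would highlight candidate AU-rich regions; whether such a region actually acts as an INS still has to be established experimentally, as the text notes for the RRE.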

3. Identification and elimination of
additional INS sequences in the p24 and p15
regions of the gag gene

To examine the effect of removal of INS in the
p17gag coding region (the p17gag coding region spans
nucleotides 336-731, as described in the description of
Fig. 1(B) above, and contains the first of three parts
(i.e., p17, p24, and p15) of the gag coding region, as
indicated in Fig. 1(A) and (B)) on the expression of
the complete gag gene, expression vectors were constructed
in which additional sequences of the gag gene were
inserted 3' to the mutationally altered p17gag coding
region, downstream of the stop codon, of vector p17M1234.
Three vectors containing increasing lengths of gag
sequences were studied: p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), as shown in Fig. 1(C).
Levels of expression of p17gag were measured, with the
results indicating that the region of the mRNA encoding the
second part of the gag protein (i.e., the part encoding
the p24gag protein, which spans nucleotides 731-1424)
contains only a weak INS, as determined by a small
reduction in the amount of p17gag protein expressed by
p17M1234(731-1424) as compared with the amount of p17gag
protein expressed by p17M1234, while the region of the
mRNA encoding the third part of the gag protein (i.e., the
part encoding the p15gag protein, which spans nucleotides
1425-2165) contains a strong INS, as determined by a large
reduction in the amount of gag protein expressed by
p17M1234(731-2165) as compared with the amount of protein
expressed by p17M1234 and p17M1234(731-1424).



4. p37M1234 vector

The above analysis allowed the construction of
vector p37M1234, which expressed high levels of p37gag
precursor protein (which contains both the p17gag and p24gag
protein regions). Vector p37M1234 was constructed by
removing the stop codon at the end of the gene encoding
the altered p17gag protein and fusing the nucleotide
sequence encoding the p24gag protein into the correct
reading frame by oligonucleotide mutagenesis. This
restored the nucleotide sequence so that it encoded the
fused p17gag and p24gag protein (i.e., the p37gag protein) as
it is encoded by HIV-1. Since the presence of the p37gag
or of the p24gag protein can be quantitated easily by
commercially available ELISA kits, vector p37M1234 can be
used for inserting and testing additional fragments
suspected of containing INS. Examples of such uses are
shown below.






5. Vectors p17M1234(731-1081)NS and p55BM1234

Other vectors which were constructed in a
similar manner as was p37M1234 were p17M1234(731-1081)NS
and p55BM1234 (Fig. 1(C)). The levels of gag expression
from each of these three vectors, which allow the
translation of the region downstream (3') of the p17
coding region, were respectively similar to the level of
gag expression from the vectors containing the nucleotide
sequences 3' to a stop codon (i.e., vectors
p17M1234(731-1081), p17M1234(731-1424) and p17M1234(731-2165),
described above). These results also demonstrate that the
INS regions in the gag gene are not affected by
translation or lack thereof through the INS region. These
results demonstrate the use of p17M1234 to detect
additional INS sequences in the HIV-1 gag coding region
(i.e., in the 1424-2165 encoding region of HIV-1 gag).
Thus, these results also demonstrate how a gene containing
one or more inhibitory/instability regions can be mutated
to eliminate one inhibitory/instability region and then
used to further locate additional inhibitory/instability
regions within that gene, if any.

6. Vectors p37M1-10D and p55M1-10

As described above, experiments indicated the
presence of INS in the p24 and p15 region of HIV-1 in
addition to those identified and eliminated in the p17gag
region of HIV-1. This is depicted schematically in Figure
6 on page 7180 of Schwartz et al., J. Virol. 66:7176-7182
(1992). In that figure, cgagM1234 is identical to
p55BM1234.

By studying the expression of p24gag protein in
vectors encoding the p24gag protein containing additional
gag and pol sequences, it was found that vectors that
contained the complete gag gene and part of the pol gene
(e.g., vector p55BM1234, see Fig. 6) were not expressed at
high levels, despite the elimination of INS-1 in the p17gag
region as described above. The inventors have
hypothesized that this is caused by the presence of
multiple INS regions able to act independently of each
other. To eliminate the additional INS, several mutant
HIV-1 oligonucleotides were constructed (see Table 2) and
incorporated in various gag expression vectors. For
example, oligonucleotides M6gag, M7gag, M8gag and M10gag
were introduced into p37M1234, resulting in p37M1-10D, and
the same oligonucleotides were introduced into p55BM1234,
resulting in p55BM1-10. These experiments revealed a
dramatic improvement of expression of p37gag (which is the
p17gag and p24gag precursor) and p55gag (which is the intact
gag precursor molecule produced by HIV-1) upon the
incorporation in the expression vectors p37M1234 and
p55BM1234 of additional mutations contained in the
oligonucleotides M6gag, M7gag, M8gag and M10gag (described
in Table 2). Fig. 6 shows that expression was
dramatically improved after the introduction of additional
mutations.

Of particular interest was p37M1-10D, which
produced very high levels of gag. This has been the
highest producing gag construct (see Fig. 6).
Interestingly, addition of gag and pol sequences as in
vectors p55BM1-10 and p55AM1-10 (Fig. 6) reduced the
levels of gag expression. Upon further mutagenesis, the
inhibitory effects of this region were partially
eliminated as shown in Fig. 6 for vector p55M1-13P0.
Introduction of mutations defined by the gag region
oligonucleotides M10gag, M11gag, M12gag, M13gag, and pol
region oligonucleotide M0pol increased the levels of gag
expression approximately six fold over vectors such as
p55BM1-10.

The HIV-1 promoter was replaced by the human
cytomegalovirus early promoter (CMV) in plasmids p37M1-10D
and p55M1-13P0 to generate plasmids pCMV37M1-10D and
pCMV55M1-13P0, respectively. For this, a fragment
containing the CMV promoter was amplified by PCR
(nucleotides -670 to +73, where +1 is the start of
transcription; see Boshart et al., Cell 41:521
(1985)). This fragment was exchanged with the StuI-BssHII
fragment in gag vectors p37M1-10D and p55M1-13P0,
resulting in the replacement of the HIV-1 promoter with
that of CMV. The resulting plasmids were compared to
those containing the HIV-1 promoter after transfection in
human cells, and gave similar high expression of gag.
Therefore, the high expression of gag can be achieved in
the total absence of any other viral protein. The
exchange of the HIV-1 promoter with other promoters is
beneficial if constitutive expression is desirable and also for
expression in other mammalian cells, such as mouse cells,
in which the HIV-1 promoter is weak.

The constructed vectors p37M1-10D and p55BM1-10
can be used for the Rev-independent production of p37gag
and p55gag proteins, respectively. In addition, these
vectors can be used as convenient reporters, to identify
and eliminate additional INS in different RNA molecules.

Using the protocols described herein, regions
have been identified within the gp41 (the transmembrane
part of HIV-1 env) coding area and at the post-env 3'
region of HIV-1 which contain INS. The elimination of INS
from gag, pol and env regions will allow the expression of
high levels of authentic HIV-1 structural proteins in the
absence of the Rev regulatory factor of HIV-1. The
mutated coding sequences can be incorporated into
appropriate gene transfer vectors which may allow the
targeting of specific cells and/or more efficient gene
transfer. Alternatively, the mutated coding sequences can
be used for direct expression in human or other cells in
vitro or in vivo with the goal being the production of
high protein levels and the generation of a strong immune
response. The ultimate goal in either case is subsequent
protection from HIV infection and disease.

The described experiments demonstrate that the
inhibitory/instability sequences are required to prevent
HIV-1 expression. This block to the expression of viral
structural proteins can be overcome by the Rev-RRE
interaction. In the absence of INS, HIV-1 expression
would be similar to simpler retroviruses and would not
require Rev. Thus, the INS is a necessary component of
Rev regulation. Sequence comparisons suggest that the INS
element identified here is conserved in all HIV-1
isolates, although this has not been verified
experimentally. The majority (22 of 28) of the mutated
nucleotides in gag are conserved in all HIV-1 isolates,
while 22 of 28 are conserved also in HIV-2 (G. Myers et
al., Eds., Human retroviruses and AIDS. A compilation and
analysis of nucleic acid and amino acid sequences (Los
Alamos National Laboratory, Los Alamos, New Mexico, 1991),
incorporated herein by reference). Several lines of
evidence indicate that all lentiviruses and other complex
retroviruses such as the HTLV group contain similar INS
regulatory elements. Strong INS elements have been
identified in the gag region of HTLV-I and SIV (manuscript
in preparation). This suggests that INS are important
regulatory elements, and may be responsible for some of
the biological characteristics of the complex
retroviruses. The presence of INS in SIV and HTLV-I
suggests that these elements are conserved among complex
retroviruses. Since INS inhibit expression, it must be
concluded that their presence is advantageous to the
virus, otherwise they would be rapidly eliminated by
mutations.

The observations that the inhibitory/instability
sequences act in the absence of any other viral proteins
and that they can be inactivated by mutagenesis suggest
that these elements may be targets for the binding of
cellular factors that interact with the mRNA and inhibit
post-transcriptional steps of gene expression. The
interaction of HIV-1 mRNAs with such factors may cause
nuclear retention, resulting in either further splicing or
rapid degradation of the mRNAs. It has been proposed that
components of the splicing machinery interact with splice
sites in HIV-1 mRNAs and modulate mRNA expression (A.
Cochrane et al., J. Virol. 65:5305-5313 (1991); D. Chang
and P. Sharp, Cell 59:789-795 (1989); X. Lu et al., Proc.
Natl. Acad. Sci. USA 87:7598-7602 (1990)). However, it is
not likely that the inhibitory/instability elements
described here are functional 5' or 3' splice sites.
Thorough mapping of HIV-1 splice sites performed by
several laboratories using the Reverse Transcriptase-PCR
technique failed to detect any splice sites within gag (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990); J.
Guatelli et al., J. Virol. 64:4093-4098 (1990); E. D.
Garrett et al., J. Virol. 65:1653-1657 (1991); M.
Robert-Guroff et al., J. Virol. 64:3391-3398 (1990); S.
Schwartz et al., J. Virol. 64:5448-5456 (1990); S. Schwartz
et al., Virology 183:677-686 (1991)). The suggestions that
Rev may act by dissociating unspliced mRNA from the
spliceosomes (D. Chang and P. Sharp, Cell 59:789-795
(1989)) or by inhibiting splicing (J. Kjems et al., Cell
67:169-178 (1991)) are not easily reconciled with the
knowledge that all retroviruses produce structural
proteins from mRNAs that contain unutilized splice sites.
Splicing of all retroviral mRNAs, including HIV-1 mRNAs in
the absence of Rev, is inefficient compared to splicing of
cellular mRNAs (J. Kjems et al., Cell 67:169-178 (1991);
A. Krainer et al., Genes Dev. 4:1158-1171 (1990); R. Katz
and A. Skalka, Mol. Cell. Biol. 10:696-704 (1990); C.
Stoltzfus and S. Fogarty, J. Virol. 63:1669-1676 (1989)).
The majority of the retroviruses do not produce Rev-like
proteins, yet they efficiently express proteins from
partially spliced mRNAs, suggesting that inhibition of
expression by unutilized splice sites is not a general
property of retroviruses. Experiments using constructs
expressing mutated HIV-1 gag and env mRNAs lacking
functional splice sites showed that only low levels of
these mRNAs accumulated in the absence of Rev and that
their expression was Rev-dependent (M. Emerman et al.,
Cell 57:1155-1165 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989); M. Malim et al.,
Nature (London) 338:254-257 (1989)). This led to the
conclusion that Rev acts independently of splicing (B.
Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989); M. Malim et al., Nature (London) 338:254-257
(1989)) and to the proposal that inhibitory/instability
elements other than splice sites are present on HIV-1
mRNAs (C. Rosen et al., Proc. Natl. Acad. Sci. USA
85:2071-2075 (1988); M. Hadzopoulou-Cladaras et al., J.
Virol. 63:1265-1274 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989)).

Construction of the Gag Expression Plasmids 

Plasmid p17R has been described as pNL17R (S.
Schwartz et al., J. Virol. 66:150-159 (1992)). Plasmid
p17 was generated from p17R by digestion with restriction
enzyme Asp718 followed by religation. This procedure
deleted the RRE and HIV-1 sequences spanning nt 8021-8561
upstream of the 3' LTR. To generate mutants of p17gag, the
p17gag coding sequence was subcloned into a modified
pBLUESCRIPT vector (Stratagene) and single-stranded
uracil-containing DNA was generated. Site-directed mutagenesis
was performed as described (T. Kunkel, Proc. Natl. Acad.
Sci. USA 82:488-492 (1985); S. Schwartz et al., Mol. Cell.
Biol. 12:207-219 (1992)). Clones containing the
appropriate mutations were selected by sequencing of
double-stranded DNA. To generate plasmid p19R, plasmid
p17R was first digested with BssHII and EcoRI, thereby
deleting the entire p17gag coding sequence, six nucleotides
upstream of the p17gag AUG and nine nucleotides of linker
sequences 3' of the p17gag stop codon. The p17gag coding
sequence in p17R was replaced by a PCR-amplified DNA
fragment containing the RSV p19gag coding sequence (R.
Weiss et al., RNA Tumor Viruses. Molecular Biology of
Tumor Viruses (Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1985)). This fragment contained eight
nucleotides upstream of the RSV gag AUG and the p19gag
coding sequence immediately followed by a translational
stop codon. The RSV gag fragment was derived from the
infectious RSV proviral clone S-RA (R. Weiss et al., RNA
Tumor Viruses. Molecular Biology of Tumor Viruses (Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York,
1985)). p19 was derived from p19R by excising an Asp718
fragment containing the RRE and 3' HIV-1 sequences
spanning nt 8021-8561.

Transfection of HLtat Cells With Gag Expression Plasmids
HLtat cells (S. Schwartz et al., J. Virol.
64:2519-2529 (1990)) were transfected using the calcium
phosphate coprecipitation technique (F. Graham and A. Van
der Eb, Virology 52:456-460 (1973)) as described (B. Felber
et al., Proc. Natl. Acad. Sci. USA 86:1495-1499 (1989)),
using 5 µg of p17, p17R, p17M1234, p19, or p19R in the
absence (-) or presence (+) of 2 µg of the Rev-expressing
plasmid pL3crev (B. Felber et al., Proc. Natl. Acad. Sci.
USA 86:1495-1499 (1989)). The total amount of DNA in
transfections was adjusted to 17 µg per 0.5 ml of
precipitate per 60 mm plate using pUC19 carrier DNA.
Cells were harvested 20 h after transfection and cell
extracts were subjected to electrophoresis on 12.5%
denaturing polyacrylamide gels and analyzed by
immunoblotting using either human HIV-1 patient serum
(Scripps) or a rabbit anti-p19gag serum. pRSV-luciferase
(J. de Wet et al., Mol. Cell. Biol. 7:725-737 (1987)), which
contains the firefly luciferase gene linked to the RSV LTR
promoter, was used as an internal standard to control for
transfection efficiency and was quantitated as described
(L. Solomin et al., J. Virol. 64:6010-6017 (1990)). The
results are set forth in Fig. 2.
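
The DNA amounts and the internal-standard correction described above reduce to simple arithmetic. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the amount of pRSV-luciferase is not given in the text and is therefore left out of the carrier calculation, and dividing the gag signal by the luciferase activity is one common way to apply an internal standard, not necessarily the exact procedure of the cited reference.

```python
def carrier_dna_ug(total_ug: float, *plasmid_ug: float) -> float:
    """Carrier DNA (ug) needed to bring a transfection mix to total_ug."""
    carrier = total_ug - sum(plasmid_ug)
    if carrier < 0:
        raise ValueError("plasmid amounts exceed the total DNA amount")
    return carrier


def normalized_expression(gag_signal: float, luciferase_activity: float) -> float:
    """Gag signal divided by the co-transfected luciferase activity."""
    if luciferase_activity <= 0:
        raise ValueError("luciferase activity must be positive")
    return gag_signal / luciferase_activity


# Amounts from the text: 17 ug total, 5 ug gag plasmid, 2 ug Rev plasmid.
# The (unstated) amount of pRSV-luciferase is ignored in this sketch.
print(carrier_dna_ug(17, 5))      # transfection without the Rev plasmid
print(carrier_dna_ug(17, 5, 2))   # transfection with the Rev plasmid
```

Normalizing each plate in this way allows expression from different plasmids to be compared even when transfection efficiency varies between plates.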

Northern Blot Analysis
HLtat cells were transfected as described above
and harvested 20 h post transfection. Total RNA was
prepared by the heparin/DNase method (Z. Krawczyk and C.
Wu, Anal. Biochem. 165:20-27 (1987)), and 20 µg of total
RNA was subjected to northern blot analysis as described
(M. Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274
(1989)). The filters were hybridized to a nick-translated
PCR-amplified DNA fragment spanning nt 8304-9008 in the
HIV-1 3' LTR. The results are set forth in Fig. 3.

EXAMPLE 2
HIV-1 ENV GENE
Fragments of the env gene were inserted into
vectors p19 or p37M1234 and the expression of the
resulting plasmids was analyzed by transfection into
HLtat cells. It was found that several fragments
inhibited protein expression. One of the strong INS
identified was in the fragment containing nucleotides
8206-8561 ("fragment [8206-8561]"). To eliminate this
INS, the following oligonucleotides were synthesized and
used in mutagenesis experiments as specified supra. The
fragment was derived from the molecular clone pNL43, which
is almost identical to HXB2. The numbering system used
herein follows the numbering of molecular clone HXB2
throughout. The synthesized oligonucleotides follow the
pNL43 sequence.

The oligonucleotides which were used to
mutagenize fragment [8206-8561], and which made changes in
the env coding region between nucleotides 8210-8555 (the
letters in lower case indicate mutated nucleotides) were:

#1:
8194-8261
GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAc
AGGGTTATAG (SEQ ID NO: 10)

#2:
8262-8323
AAGTATTACAAGCcGCcTAccGcGCcATcaGaCAtATcCCccGccGcATccGcCAGGG
CTTG (SEQ ID NO: 11)

#3:
8335-8392
GCTATAAGATGGGcGGtAAaTGGagcAAgtcctccGTcATcGGcTGGCCTGCTGTAAG
(SEQ ID NO: 12)

#4:
8393-8450
GGAAAGAATGcGcaGgGCcGAaCCcGCcGCcGAcGGaGTtGGcGCcGTATCTCGAGAC
(SEQ ID NO: 13)

#5:
8451-8512
CTAGAAAAACAcGGcGCcATtACctcctCtAAcACcGCcGCcAAtAAcGCcGCTTGTG
CCTG (SEQ ID NO: 14)

#6:
8513-8572
GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTA
AG (SEQ ID NO: 15)

The expression of env was increased by the
elimination of the INS in fragment [8206-8561] as
determined by analysis of both mRNA and protein.
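
The lower-case convention and the coordinate ranges in the oligonucleotide list above can be cross-checked with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the coordinate ranges are treated as inclusive, and the first nucleotide of each oligonucleotide is assumed to sit at the first listed coordinate.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive nucleotide interval."""
    return end - start + 1


def mutated_positions(oligo: str, start: int) -> list[int]:
    """Coordinates of lower-case (mutated) nucleotides in an oligonucleotide."""
    return [start + i for i, base in enumerate(oligo) if base.islower()]


# Oligonucleotides #1 (8194-8261) and #6 (8513-8572) as printed in the text.
OLIGO_1 = (
    "GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAc"
    "AGGGTTATAG"
)
OLIGO_6 = (
    "GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTA"
    "AG"
)

positions_1 = mutated_positions(OLIGO_1, 8194)
positions_6 = mutated_positions(OLIGO_6, 8513)
print(len(OLIGO_1) == span_length(8194, 8261))
print(len(OLIGO_6) == span_length(8513, 8572))
print(positions_1[0], positions_6[-1])
```

Under these assumptions the first mutated position of oligonucleotide #1 and the last mutated position of oligonucleotide #6 come out at nt 8210 and nt 8555, in agreement with the stated range of changes in the env coding region.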

To further characterize in detail the INS in
HIV-1 env, the coding region of env was divided into
different fragments, which were produced by PCR using
appropriate synthetic oligonucleotides, and cloned in
vector p37M1-10D. This vector was produced from p37M1234
by additional mutagenesis as described above. After
introduction into human cells, vector p37M1-10D produces
high levels of p37gag protein. Any strong INS element will
inhibit the expression of gag if ligated in the same
vector. The summary of the env fragments used is shown in
Figure 11. The results of these experiments show that,
as in HIV-1 gag, there exist multiple regions inhibiting
expression in HIV-1 env, and combinations of such regions
result in additive or synergistic inhibition. For
example, while fragments 1, 2, or 3 individually inhibit
expression by 2-6 fold, the combination of these fragments
inhibits expression by 30 fold. Based on these results,
additional mutant oligonucleotides have been synthesized
for the correction of env INS. These oligonucleotides
have been introduced in the expression vectors for HIV-1
env p120pA and p120R270 (see Fig. 7) for the development
of Rev-independent HIV-1 env expression plasmids as
discussed in detail below.
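
The fold-inhibition figures quoted above for fragments 1, 2 and 3 can be put in context with a short calculation. The sketch below is an illustration only: the function names are hypothetical, and the assumption that independent elements would combine multiplicatively is introduced here for comparison and is not a model stated in the specification.

```python
from math import prod


def fold_inhibition(reference: float, with_insert: float) -> float:
    """Reporter expression without the insert divided by expression with it."""
    if with_insert <= 0:
        raise ValueError("expression with insert must be positive")
    return reference / with_insert


def multiplicative_expectation(folds: list[float]) -> float:
    """Combined fold inhibition if each fragment acted independently."""
    return prod(folds)


# Bounds taken from the text: each fragment alone inhibits 2- to 6-fold.
low = multiplicative_expectation([2, 2, 2])
high = multiplicative_expectation([6, 6, 6])
observed = 30  # combined inhibition reported in the text
print(low, high, low <= observed <= high)
```

Because the individual per-fragment values are given only as a range, this comparison can bracket the combined effect but cannot by itself decide between the additive and synergistic readings.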

1. The mRNAs for gp160 and for the
extracellular domain (gp120) are defective
and their expression depends on the
presence of RRE in cis and Rev in trans

1.1 Positive and Negative Determinants for
env mRNA Expression of HIV



Previous experiments on the identification and
characterization of the env expressing cDNAs had
demonstrated that Env is produced from mRNAs that contain
exon 4AE, 4BE, or 5E (Schwartz et al., J. Virol.
64:5448-5456 (1990); Schwartz et al., Mol. Cell. Biol.
12:207-219 (1992)). All constructs generated to study the
determinants of env expression are derived from pNL15E.
This plasmid contains the HIV-1 LTR promoter, the complete
env cDNA 15E, and the HIV 3' LTR including the
polyadenylation signal (Schwartz et al., J. Virol.
64:5448-5456 (1990)) (Fig. 7). pNL15E was generated from
the molecular clone pNL4-3 (pNL4-3 is identical to pNL43
herein) (Adachi et al., J. Virol. 59:284-291 (1986)) and
lacks the splice acceptor site for exon 6D, which was used
to generate the tev mRNA (Benko et al., J. Virol.
64:2505-2518 (1990)). The Env expression plasmids were
transfected in the presence or absence of the Rev-expressing
plasmid pL3crev (Felber et al., J. Virol. 64:3734-3741
(1990)) into HLtat cells (Schwartz et al., J. Virol.
64:2519-2529 (1990)), which constitutively express Tat
(one-exon Tat). One day later, the cells were harvested for
analyses of RNA and protein. Total RNA was extracted and
analyzed on Northern blots. Protein production was measured by
Western blots to detect cell-associated Env. In the
absence of Rev, NL15E mRNA was efficiently spliced and
produced Nef; in the presence of Rev, most of the RNA
remained unspliced and produced the Env precursor gp160,
which is processed to gp120, the secreted portion of the
precursor, and gp41.

To allow for the effects of INS to be
distinguished and studied separately from splicing, splice
sites known to exist within some of the fragments used
were eliminated as discussed below. Analysis of the
resulting expression vectors included size determination
of the produced mRNA, providing the verification that
splicing does not interfere with the interpretation of the
data.

1.2 Env expression is Rev-dependent also
in the absence of functional splice
sites

To study the effect of splicing on env
expression, the splice donor at nt 5592 was removed by
site-directed mutagenesis (changing GCAGTA to GaAtTc, and
thus introducing an EcoRI site), which resulted in plasmid
15ESD- (Fig. 7). The mRNA from this construct was
efficiently spliced and produced a small mRNA encoding Nef
(Fig. 8). Sequence analysis revealed that this spliced
mRNA was generated by the use of an alternative splice
donor located at nt 5605 (TACATgtaatg) and the common
splice acceptor site at nt 7925. In contrast to published
work (Lu et al., Proc. Natl. Acad. Sci. USA 87:7598-7602
(1990)), expression of Env from this mutant depended on
Rev. Next, the splice acceptor site was mutated at nt
7925. Since previous cDNA cloning had revealed that in
addition to the splice acceptor site at nt 7925 there are
two additional splice acceptor sites at nt 7897 and nt
7901 (Schwartz et al., J. Virol. 64:2519-2529 (1990)), this
region of 43 bp encompassing nt 7884 to nt 7926 was
removed. This resulted in p15EDSS (Fig. 7). Northern
blot analysis of mRNA from HLtat cells transfected with
this construct confirmed that the 15EDSS mRNA is not
spliced (Fig. 8B). Although all functional splice sites
have been removed from p15EDSS, Rev is still required for
Env production (Fig. 8A). Taken together with data
obtained by studying gag expression, these results suggest
that the presence of inefficiently used splice sites is
not the primary determinant for Rev-dependent Env
expression. It is known that at least two unused splice
sites are present in this mRNA (the alternative splice
donor at nt 5605 and the splice donor of exon 6D at nt
6269). Therefore, it cannot be ruled out that initial
spliceosome formation can occur, which does not lead to
the execution of splicing. It is possible that this is
sufficient to retain the mRNA in the nucleus and, since no
splicing occurs, that this would lead to degradation of
the mRNA. Alternatively, it is possible that
splice-site-independent RNA elements similar to those
identified within the gag/pol region (INS) are responsible
for the Rev dependency (Schwartz et al., J. Virol.
66:7176-7182 (1992); Schwartz et al., J. Virol.
66:150-159 (1992)).

1.3 Identification of negative elements
within gp120 mRNA

To distinguish between these possibilities, a
series of constructs were designed that allowed the
determination of the location of such INS elements.
First, a stop codon followed by the restriction sites for
NruI and MluI was introduced at the cleavage site between
the extracellular gp120 and the transmembrane protein gp41
at nt 7301 in plasmid NL15EDSS, resulting in p120DSS (Fig.
7). Immunoprecipitation of gp120 from the medium of cells
transfected with p120DSS confirmed the production of high
levels of gp120 only in the presence of Rev (Fig. 9B).
The release of gp120 is very efficient, since only barely
detectable amounts remain associated with the cells (data
not shown). This finding rules out the possibility that
the translation of the gp41 portion of the env cDNA is
responsible for the defect in env expression. Next, the
region 3' of the stop codon of gp120 (consisting of gp41,
including the RRE and 3' LTR) was replaced with the SV40
polyadenylation signal (Fig. 7). This
construct, p120pA, produced very low levels of gp120 in
the absence of Rev (Fig. 9B). Background levels of Env
were produced from p120DR (Fig. 7), which was generated
from pBS120DSS by removing the 5' portion of gp41
including the RRE (MluI to HpaI at nt 8200) (Fig. 9B).
These results demonstrate the presence of a major INS-like
sequence within the gp120 portion. To study the effect of
Rev on this mRNA, different RREs (RRE330, RRE270, and
RRED345 (Solomin et al., J. Virol. 64:6010-6017 (1990)))
were inserted into p120pA downstream of the gp120 stop
codon, resulting in p120R330, p120R270, and p120RD345,
respectively (Fig. 7). Immunoprecipitations demonstrated
that the presence of Rev in trans and the RRE in cis could
rescue the defect in the gp120 expression plasmid. High
levels of gp120 were produced from p120R330 (data not
shown), p120R270, and p120RD345 (Fig. 9B) in the presence
of Rev.

Northern blot analysis (Fig. 8A) confirmed the
protein data. The presence of Rev resulted in the
accumulation of high levels of mRNA produced by pBS120DSS,
p120R270, and p120RD345. Low but detectable levels of RNA
were produced from p120DpA and p120DR.

2. Identification of INS elements located
within the env mRNA regions using two
strategies

To identify elements that have a down-regulatory
effect in vivo, fragments of env cDNA were inserted into
two different test expression vectors, p19 and p37M1-10D.
These vectors contain a strong promoter for rapid
detection of the gene product, such as the HIV-1 LTR in
the presence of Tat, and an indicator gene that is
expressed at high levels and can easily be assayed, such as
p19gag of RSV or the mutated p37gag gene of HIV-1
(p37M1-10D), neither of which contains any known INS-like
elements. Expression vector p19 contains the HIV-1 LTR
promoter, the RSV p19gag matrix gene, and HIV-1 sequences
starting at KpnI (nt 8561) including the complete 3' LTR
(Schwartz et al., J. Virol. 66:7176-7182 (1992)). Upon
transfection into HLtat cells, high levels of p19gag are
constitutively produced and are visualized on Western
blots. Expression vector p37M1-10D contains the HIV-1 LTR
promoter, the mutant p37gag (M1-10), and the 3' portion of
the virus starting at KpnI (nt 8561). Upon transfection
into HLtat cells, this plasmid constitutively produces
p37gag that can be quantitated by the HIV-1 p24gag antigen
capture assay.

2.1 Identification of INS elements using
the RSV gag expression vector

INS elements within the gp41 and gp120 portions
were identified. To this end, the vector p19 was used and
the following fragments (Fig. 10) were inserted: (A) nt
7684 to 7959; (B) nt 7684 to 7884 and nt 7927 to 7959;
this is similar to fragment A but has the region of the
splice acceptors 7A, 7B and 7 deleted; (C) nt 7595 to 7884
and nt 7927 to 7959, having the splice sites deleted as in
B; (D) nt 7939 to 8066; (E) nt 7939 to 8416; (F) nt 8200
to 8561 (HpaI-KpnI); (G) nt 7266 to 7595 containing the
intact RRE; (H) nt 5523 to 6190, having the splice donor
SD5 deleted.
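
As a bookkeeping aid, the fragment coordinates listed above can be tabulated and their lengths computed. The following is only an illustrative sketch: it assumes the nucleotide ranges are inclusive, and the names FRAGMENTS and fragment_length are hypothetical helpers, not part of the described method. The extent of the SD5 deletion in fragment H is not stated, so H is shown as its full span.

```python
# Fragment coordinates as listed in the text (assumed to be inclusive nt ranges).
# Fragments B and C are composite: the splice acceptor region is left out.
FRAGMENTS = {
    "A": [(7684, 7959)],
    "B": [(7684, 7884), (7927, 7959)],
    "C": [(7595, 7884), (7927, 7959)],
    "D": [(7939, 8066)],
    "E": [(7939, 8416)],
    "F": [(8200, 8561)],
    "G": [(7266, 7595)],
    "H": [(5523, 6190)],  # SD5 deletion coordinates are not given in the text
}


def fragment_length(segments):
    """Total length in nucleotides of a fragment made of inclusive segments."""
    return sum(end - start + 1 for start, end in segments)


for name, segments in FRAGMENTS.items():
    print(name, fragment_length(segments))
```

Under the inclusive-range assumption, fragment B is shorter than fragment A by exactly the omitted splice acceptor region (nt 7885 to 7926).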

Fragments A, B, and D did not affect Gag
expression, whereas fragment G (RRE) decreased Gag
expression approximately 5-fold. Fragments C, E, and H lowered
Gag expression by about 10- to 20-fold, indicating the presence
of INS elements.

Interestingly, it was observed that the
insertion of element F spanning 350 bp in plasmid p19
abolished production of Gag, indicating the presence of a
strong INS within this element. The presence of the RRE
in cis and Rev in trans resulted in production of high
levels of RSV p19gag. Fragment F also had a smaller
downregulatory effect on the expression of the
INS-corrected p17gag of HIV-1 (p17M1234). These
experiments revealed the presence of multiple elements
located within the env mRNA that cause inhibition of p19gag
expression.

2.2 Elimination of the INS within
fragment F

Six synthetic oligonucleotides (Table 3) were
generated that introduced 103 point mutations within this
region of 330 nt without affecting the amino acid
composition of Env. The mutated fragment F was tested in
p19 to verify that the INS elements are destroyed. The
introduction of the mutations within oligo #1 only
marginally affected the expression of p19gag, whereas the
presence of all oligos (#1 to #6) completely inactivated
the INS effect of fragment F. This is another example
that more than one region within an INS element needs to
be mutagenized to eliminate the INS effect.
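
The requirement that point mutations leave the amino acid composition unchanged can be checked by translating the wild-type and mutant sequences and comparing the results. The following is a minimal sketch, not the procedure used in the experiments. It uses the M8gag pair from Table 2 for illustration and assumes that nt 1140 is the first position of a gag codon (an assumption inferred from the gag coding region starting at nt 336 in Table 1). The function names are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate a DNA sequence; lower-case (mutated) letters are accepted."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def count_changes(wild_type, mutant):
    """Number of positions at which the two sequences differ (case-insensitive)."""
    return sum(a != b for a, b in zip(wild_type.upper(), mutant.upper()))


# M8gag (nt 1140-1175) from Table 2; reading frame 0 is an assumption.
WT = "GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC"
MUT = "GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC"

print(translate(WT), translate(MUT), count_changes(WT, MUT))
```

If the two translations are identical while the change count is nonzero, every substitution in the oligonucleotide is silent in the assumed reading frame.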

It is noteworthy that this INS element is
present in all the multiply spliced Rev-independent mRNAs,
such as tat, rev and nef. Experiments were performed to
define the function of fragment F within the class of the
small mRNAs by removing this fragment from the tat cDNA.
In the context of this mRNA, this element confers only a
weak INS effect (3- to 5-fold inhibition), which suggests
that inhibition of expression in env mRNA may require the
presence of at least two distinct elements. These results
suggested that the INS effect within env is based on
multiple interacting components. Alternatively, the
relative location and interactions among multiple INS
components may be important for the magnitude of the INS
effect. Therefore, more than one type of analysis in
different vectors may be necessary for the identification
and elimination of INS.



2.3. Identification of INS elements using
p37M1-10D expression vector

The env coding region was subdivided into
different consecutive fragments. These fragments and
combinations thereof were PCR-amplified using oligos as
indicated in Fig. 11 and inserted downstream of the
mutated p37gag gene in p37M1-10D. The plasmids were
transfected into HLtat cells that were harvested the next
day and analyzed for p24gag expression. Fig. 11 shows that
the presence of fragments 2, 3, 5 as well as the
combination 1+2+3 lowered gag expression substantially.
Different oligos (Table 4) were synthesized that change
the AT-rich domains including the three AATAAA elements
located within the env coding region by changing the
nucleotide but not the amino acid composition of Env. In
a first approach, these oligos 1-19 are being introduced
into plasmid p120R270 with the goal of producing gp120 in
a Rev-independent manner. Oligonucleotides such as oligos
20-26 will then be introduced into the gp41 portion, the
two env portions combined and the complete gp160 expressed
in a Rev-independent manner.

EXAMPLE 3

PROTO-ONCOGENE C-FOS
Fragments of the fos gene were inserted into the
vector p19 and the expression of the resulting plasmids
was analyzed by transfection into HLtat cells. It was
found that several fragments inhibited protein expression.
A strong INS was identified in the fragment containing
nucleotides 3328-3450 ("fragment [3328-3450]")
(nucleotides of the fos gene are numbered according to
GenBank sequence entry HUMCFOT, ACCESSION # V01512). In
addition, a weaker element was identified in the coding
region.

To eliminate these INS the following
oligonucleotides were synthesized and are used in
mutagenesis experiments as specified supra.

To eliminate the INS in the fos non-coding
region, the following oligonucleotides, which make changes
in the fos non-coding region between nucleotides [3328-
3450] (the letters in lower case indicate mutated
nucleotides), were synthesized and are used to mutagenize
fragment [3328-3450]:

#1:

3349-3391

TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA (SEQ ID NO:
16)

#2:

3392-3434

TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA (SEQ ID NO:
17)

These oligonucleotides are used to mutagenize
fos fragment [3328-3450] inserted into vectors p19,
p17M1234 or p37M1234 and the expression of the resulting
plasmids is analyzed after transfection into HLtat cells.

The expression of fos is expected to be
increased by the elimination of this INS region.
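
Because the lower-case letters in each oligonucleotide mark the mutated nucleotides, the affected fos coordinates can be listed directly from the oligonucleotide and its start position. The sketch below is illustrative only; it uses oligonucleotide #1 (nt 3349-3391) as given above, assumes inclusive numbering, and the function name mutated_coordinates is hypothetical.

```python
def mutated_coordinates(oligo, start):
    """Return the sequence coordinates of lower-case (mutated) nucleotides.

    `start` is the coordinate of the first nucleotide of the oligo
    (inclusive numbering is assumed).
    """
    return [start + i for i, base in enumerate(oligo) if base.islower()]


# fos non-coding region oligonucleotide #1, nt 3349-3391 (SEQ ID NO: 16)
OLIGO_1 = "TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA"
START, END = 3349, 3391

coords = mutated_coordinates(OLIGO_1, START)
print(len(OLIGO_1), END - START + 1)  # oligo length vs. length implied by coordinates
print(coords)
```

Comparing the oligonucleotide length with the length implied by its coordinates is also a quick consistency check on the transcribed sequence.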

To further define and eliminate the INS elements
in the coding region, additional longer fragments of fos
are introduced into vector p37M1234. The INS element in
the coding region is first mapped more precisely using
this expression vector and is then corrected using the
following oligonucleotides:

#1

2721-2770

GCCCTGTGAGtaGGCActGAAGGacAGcCAtaCGtaACatACAAGTGCCA (SEQ ID
NO: 18)

#2

2670-2720

AGCAGCAGCAATGAaCCTagtagcGAtagcCTgAGtagcCCtACGCTGCTG (SEQ
ID NO: 19)

#3

2620-2669

ACCCCGAGGCaGAtagCTTtCCatccTGcGCtGCcGCtCACCGCAAGGGC (SEQ ID
NO: 20)

#4

2502-2562

CTGCACAGTGGaagCCTcGGaATGGGcCCtATGGCtACcGAatTGGAaCCaCTGTGCA
CTC (SEQ ID NO: 21)

The expression of fos is expected to be
increased by the elimination of this INS region.

EXAMPLE 4

HIV-1 POL GENE
Vector p37M1234 was used to eliminate an
inhibitory/instability sequence from the pol gene of HIV-1
which had been characterized by A.W. Cochrane et al.,
"Identification and characterization of intragenic
sequences which repress human immunodeficiency virus
structural gene expression", J. Virol. 65:5305-5313
(1991). These investigators suggested that a region in
pol (HIV nucleotides 3792-4052), termed CRS, was important
for inhibition. A larger fragment spanning this region,
which contained nucleotides 3700-4194, was inserted into
the vector p37M1234 and its effect on the expression of
p37gag from the resulting plasmid (plasmid p37M1234RCRS)
(see Fig. 12) was analyzed after transfection into HLtat
cells.

Severe inhibition of gag expression (10-fold,
see Fig. 13) was observed.

In an effort to eliminate this INS, the
following oligonucleotides were synthesized (the letters
in lower case indicate mutated nucleotides) and used in
mutagenesis experiments.

First, it was observed that one AUUUA potential
instability element was within the INS region. This was
eliminated by mutagenesis using oligonucleotide M10pol and
resulted in plasmid p37M1234RCRSP10. The expression of
gag from this plasmid was not improved, demonstrating that
elimination of the AUUUA element alone did not eliminate
the INS. See Fig. 12. Therefore, additional mutagenesis
was performed and it was shown that a combination of
mutations introduced in plasmid p37M1234RCRS was necessary
and sufficient to produce high levels of gag proteins,
which were similar to those of the plasmid lacking CRS. The
mutations necessary for the elimination of the INS are
shown in Fig. 13.

The above results demonstrate that HIV-1 pol
contains INS elements that can be detected and eliminated
with the techniques described.

These results also suggest that regions outside
of the minimal inhibitory region in CRS as defined by A.W.
Cochrane et al., supra, influence the levels of
expression. These results suggest that the RNA structure
of the region is important for the inhibition of
expression.

Table 1

Correspondence between Sequence
Identification Numbers and Nucleotides in Figure 4

Sequence ID Nos.     Figure 4

SEQ ID NO: 1         nucleotides 336-731
SEQ ID NO: 2         nucleotides 402-452
SEQ ID NO: 3         nucleotides 536-583, above line
SEQ ID NO: 4         nucleotides 585-634, above line
SEQ ID NO: 5         nucleotides 654-703, above line
SEQ ID NO: 6         nucleotides 402-452, below line (M1)
SEQ ID NO: 7         nucleotides 536-583, below line (M2)
SEQ ID NO: 8         nucleotides 585-634, below line (M3)
SEQ ID NO: 9         nucleotides 654-703, below line (M4)


Table 2

Synthetic oligonucleotides used
in the mutagenesis of HIV-1 gag and pol regions

The upper sequence is the wild-type HIV-1 as
found in HIV^^ while the bottom is the mutant 
oligonucleotide sequence. The location of the sequence is
indicated in parentheses. 
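
The X markers printed between each wild-type and mutant pair flag the substituted positions. As a hedged illustration of how those positions can be recomputed from the two sequences alone, the sketch below compares the M9gag pair (nt 1228-1268) from this table. The function names diff_positions and marker_line are hypothetical, and positions are reported 1-based within the oligonucleotide.

```python
def diff_positions(wild_type, mutant):
    """1-based positions at which the mutant differs from the wild type."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        i + 1
        for i, (a, b) in enumerate(zip(wild_type.upper(), mutant.upper()))
        if a != b
    ]


def marker_line(wild_type, mutant):
    """Build an aligned marker line with 'X' under each substituted position."""
    changed = set(diff_positions(wild_type, mutant))
    return "".join("X" if i + 1 in changed else " " for i in range(len(wild_type)))


# M9gag (1228-1268): wild type (SEQ ID NO: 30) and mutant (SEQ ID NO: 31)
M9_WT = "ACCGGTTCTATAAAACTCTAAGAGCCGAGCAAGCTTCACAG"
M9_MUT = "ACCGGTTCTAcAAgACcCTgcGgGCtGAGCAAGCTTCACAG"

print(M9_WT)
print(marker_line(M9_WT, M9_MUT))
print(M9_MUT)
print(diff_positions(M9_WT, M9_MUT))
```

Printed in a fixed-width font, the marker line aligns each X with the substituted nucleotide, which is how the table entries are intended to be read.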



M5gag (778-824) 

CACCTAGAACTTTAAATGCATGGGTAAAAGTAGTAGAAGAGAAGGCT (SEQ ID 
NO: 22) 

XXXXX XXX

CACCTAGAACccTgAAcGCcTGGGTgAAgGTgGTAGAAGAGAAGGCT (SEQ ID
NO: 23)

M6gag (871-915)

CCACCCCACAAGATTTAAACACCATGCTAAACACAGTGGGGGGAC (SEQ ID NO: 
24) 

xxxxxxxx 

CCACCCCACAgGAccTgAACACgATGtTgAACACcGTGGGGGGAC (SEQ ID NO: 
25) 



M7gag (1105-1139) 

CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG (SEQ ID NO: 26) 

X X X X X 
CAGTAGGAGAgATcTAcAAgAGgTGGATAATCCTG (SEQ ID NO: 27)



M8gag (1140-1175) 

GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC (SEQ ID NO: 28) 

X X X X X X 
GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC (SEQ ID NO: 29) 



M9gag (1228-1268) 

ACCGGTTCTATAAAACTCTAAGAGCCGAGCAAGCTTCACAG (SEQ ID NO: 30) 
X X X XX X X

ACCGGTTCTAcAAgACcCTgcGgGCtGAGCAAGCTTCACAG (SEQ ID NO: 31) 



M10gag (1321-1364)
ATTGTAAGACTATTTTAAAAGCATTGGGACCAGCGGCTACACTA (SEQ ID NO: 
32) 

X XX X X XX X X 
ATTGTAAGACcATcCTgAAgGCtcTcGGcCCAGCGGCTACACTA (SEQ ID NO: 
33) 



M11gag (1416-1466)

AGAGTTTTGGCTGAAGCAATGAGCCAAGTAACAAATTCAGCTACCATAATG (SEQ
ID NO: 34)

XXX XXXXXX

AGAGTTTTGGCcGAgGCgATGAGCCAgGTgACgAAcTCgGCgACCATAATG (SEQ 
ID NO: 35) 



M12gag (1470-1520) 

CAGAGAGGCAATTTTAGGAACCAAAGAAAGATTGTTAAGTGTTTCAATTGT (SEQ
ID NO: 36)

X XX XX X XX

CAGAGAGGCAAcTTccGGAACCAgcGgAAGATcGTcAAGTGTTTCAATTGT (SEQ
ID NO: 37)

M13gag (1527-1574)

GAAGGGCACACAGCCAGAAATTGCAGGGCCCCTAGGAAAAAGGGCTGT (SEQ ID 
NO: 38) 

XXX XX X 

GAAGGGCACACcGCCAGgAAcTGCcGGGCCCCccGGAAgAAGGGCTGT (SEQ ID 
NO: 39) 



M14gag (1581-1631) 

TGTGGAAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCTAAT (SEQ 
ID NO: 40) 

XX X XXXXXX 

TGTGGAAAGGAgGGgCACCAgATGAAgGAcTGcACgGAGcGgCAGGCTAAT (SEQ 
ID NO: 41) 


MOpol (1823-1879) (K to R difference introduced) 
CCCCTCGTCACAATAAAGATAGGGGGGCAACTAAAGGAAGCTCTATTAGATACAGGAG 

(SEQ ID NO: 42) 

XXX X X XX X 

CCCCTCGTCACAgTAAgGATcGGGGGGCAACTcAAGGAAGCgCTgcTcGATACAGGAG 

(SEQ ID NO: 43) 

M1pol (1936-1987)

GATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGATACTC (SEQ
ID NO: 44)

XXXXX XXX XX 

GATAGGGGGgATcGGgGGcTTcATCAAgGTgAGgCAGTAcGAcCAGATACTC (SEQ 
ID NO: 45) 


M2pol (2105-2152) 

CCTATTGAGACTGTACCAGTAAAATTAAAGCCAGGAATGGATGGCCCA (SEQ ID 
NO: 46) 

XXXXX X XX 
CCTATTGAGACgGTgCCcGTgAAgTTgAAGCCgGGgATGGATGGCCCA (SEQ ID 
NO: 47)



M3.2pol (2162-2216) 

CAATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAAATTTGTACAGAGA 
(SEQ ID NO: 48) 

X XXXXX X X 

CAATGGCCATTGACgGAAGAgAAgATcAAgGCcTTAGTcGAAATcTGTACAGAGA 
(SEQ ID NO: 49)

M4pol (2465-2515)

TTCAGGAAGTATACTGCATTTACCATACCTAGTATAAACAATGAGACACCA (SEQ
ID NO: 50)

XXXX XXXX X 

TTCAGGAAGTAcACgGCgTTcACCATcCCgAGcATcAACAAcGAGACACCA (SEQ
ID NO: 51)



M5pol (2873-2921) 

TTAGTGGGGAAATTGAATTGGGCAAGTCAGATTTACCCAGGGATTAAAG (SEQ ID 
NO: 52) 

XX X X X X X 

TTAGTGGGGAAggTGAAcTGGGCgAGcCAGATcTACCCgGGGATTAAAG (SEQ ID 
NO: 53) 


M6pol (3098-3150) 

GGCCAATGGACATATCAAATTTATCAAGAGCCATTTAAAAATCTGAAAACAGG (SEQ
ID NO: 54)

XXXXXX XXXX 
GGCCAATGGACgTAcCAgATcTAcCAgGAGCCgTTcAAgAAcCTGAAAACAGG (SEQ 
ID NO: 55) 


M7pol (3242-3290) 

TGGGGAAAGACTCCTAAATTTAAACTGCCCATACAAAAGGAAACATGGG (SEQ ID 
NO: 56) 

XXXXX XX X 

TGGGGAAAGACgCCgAAgTTcAAgCTGCCCATcCAgAAGGAgACATGGG (SEQ ID 
NO: 57) 


M8pol (3520-3569) 

GAAGACTGAGTTACAAGCAATTTATCTAGCTTTGCAGGATTCGGGATTAG (SEQ ID 
NO: 58) 

XXXXXXXXX X 
GAAGACTGAGcTgCAgGCgATcTAcCTgGCgcTGCAGGAcTCGGGATTAG (SEQ ID 
NO: 59)



M8-2pol (3643-3698) 

GTTAGTCAATCAAATAATAGAGCAGTTAATAAAAAAGGAAAAGGTCTATCTGGCAT 
(SEQ ID NO: 60) 

X XX XXXX X X 

GTTAGTCAAcCAAATcATcGAGCAGcTgATcAAgAAGGAgAAGGTgTATCTGGCAT 
(SEQ ID NO: 61)

M9pol (3749-3800)

GTCAGTGCTGGAATCAGGAAAGTACTATTTTTAGATGGAATAGATAAGGCCC (SEQ
ID NO: 62)

XX XX xxxxxx 

GTCAGTGCTGGgATCcGGAAgGTgCTATTccTgGAcGGgATcGATAAGGCCC (SEQ

ID NO: 63) 



M9.2pol (3806-3863) 

GAACATGAGAAATATCACAGTAATTGGAGAGCAATGGCTAGTGATTTTAACCTGCCAC 

(SEQ ID NO: 64) 

X X XXX X XXX xxxx 
GAACATGAGAAgTAcCACtCCAAcTGGcGcGCtATGGCcAGcGAcTTcAACCTGCCAC 

(SEQ ID NO: 65) 

M10pol (3950-4001)

GGAATATGGCAACTAGATTGTACACATTTAGAAGGAAAAGTTATCCTGGTAG (SEQ 
ID NO: 66) 

xxxxxxxxxxxx 

GGAATATGGCAgCTgGAcTGcACgCAccTgGAgGGgAAgGTgATCCTGGTAG (SEQ 
ID NO: 67) 

M11pol (4031-4096)

GCAGAAGTTATTCCAGCAGAAACAGGGCAGGAAACAGCATATTTTCTTTTAAAATTAGCAGGAAGA
(SEQ ID NO: 68)

XXX X XXXXXXXX XX

GCAGAAGTTATcCCtGCtGAAACtGGGCAGGAgACcGCcTAcTTcCTgcTcAAAcTcGCAGGAAGA
(SEQ ID NO: 69)



M12pol (4097-4151) 

TGGCCAGTAAAAACAATACATACTGACAATGGCAGCAATTTCACCGGTGCTACGG 

(SEQ ID NO: 70) 

XXXXXX XX X X 

TGGCCAGTgAAgACgATcCAcACgGACAAcGGaAGCAAcTTCACtGGTGCTACGG 

(SEQ ID NO: 71) 



M13pol (4220-4271) 

GGAGTAGTAGAATCTATGAATAAAGAATTAAAGAAAATTATAGGACAGGTAA (SEQ
ID NO: 72)

X XXXX XXX

GGAGTAGTAGAATCcATGAAcAAgGAAcTgAAGAAgATcATcGGACAGGTAA (SEQ
ID NO: 73)



M12pol-p (4097-4151) (indicates the sequence found in
p37M1234RCRSP10+P12p)

TGGCCAGTAAAAACAATACAcACgGACAAcGGaAGCAAcTTCACtGGTGCTACGG
(SEQ ID NO: 74)

Table 3

Sequences of mutant oligos designed
to eliminate the INS effect of fragment F

The six oligonucleotides used to eliminate the 
INS effect of fragment F (oligos #1 to #6) are set forth 
above in Example 2 (SEQ. ID. NOS. 10-15). 



Table 4 

Sequence of mutant oligos designed to 
destroy INS elements within the env coding region

The wildtype (top) and the mutant oligo (below) 

of 26 different regions are shown. 

mutant oligos for env of HIV-1 : 

M1 (5834-5878) 46-mer

CTTGGGATGTTGATGATCTGTAGTGCTACAGAAAAATTGTGGGTC (SEQ ID NO: 

75) 

X xxxxxxxx 

CTTGGGATGcTGATGATcTGcAGcGCcACcGAgAAgcTGTGGGTC (SEQ ID NO: 
76) 



M2 (5886-5908) 24-mer

ATTATGGGGTACCTGTGTGGAAG (SEQ ID NO: 77)
XXX

ATTATGGcGTgCCcGTGTGGAAG (SEQ ID NO: 78)

M3 (5920-5956) 38-mer

CACTCTATTTTGTGCATCAGATGCTAAAGCATATGAT (SEQ ID NO: 79) 
XXXXXXX

CACTCTATTcTGcGCcTCcGAcGCcAAgGCATATGAT (SEQ ID NO: 80) 

M4 (5957-5982) 27-mer 

ACAGAGGTACATAATGTTTGGGCCAC (SEQ ID NO: 81) 
X X X X 

ACAGAGGTgCAcAAcGTcTGGGCCAC (SEQ ID NO: 82) 

M5 (6006-6057) 53-mer

CCAACCCACAAGAAGTAGTATTGGTAAATGTGACAGAAAATTTTAACATGTG (SEQ 

ID NO: 83) 

XXXXXX XX X X X X 

CCAACCCcCAgGAgGTgGTgcTGGTgAAcGTGACcGAgAAcTTcAACATGTG (SEQ
ID NO: 84)

M6 (6135-6179) 46-mer

TAACCCCACTCTGTGTTAGTTTAAAGTGCACTGATTTGAAGAATG (SEQ ID NO: 
85) 

X X X XX X X XX 

TAACCCCcCTCTGcGTgAGccTgAAGTGCACcGAccTGAAGAATG (SEQ ID NO: 
86) 



M7 (6251-6280) 31-mer

ATCAGCACAAGCATAAGAGGTAAGGTGCAG (SEQ ID NO: 87)

X XX X X

ATCAGCACcAGCATccGcGGcAAGGTGCAG (SEQ ID NO: 88)

M8 (6284-6316) 34-mer 

GAATATGCATTTTTTTATAAACTTGATATAATA (SEQ ID NO: 89) 
X X X X X X 
GAATATGCcTTcTTcTAcAAgCTgGATATAATA (SEQ ID NO: 90)

M9 (6317-6343) (28-mer) 

CCAATAGATAATGATACTACCAGCTAT (SEQ ID NO: 91) 

X XXX 
CCAATAGcTAAgGAcACcACCAGCTAT (SEQ ID NO: 92) 

M10 (6425-6469) (46-mer)

GCCCCGGCTGGTTTTGCGATTCTAAAATGTAATAATAAGACGTTC (SEQ ID NO: 
93) 

XXX XXXXXX 

GCCCCGGCcGGcTTcGCGATcCTgAAgTGcAAcAAcAAGACGTTC (SEQ ID NO: 
94) 

M11 (6542-6583) (42-mer)
CAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTA (SEQ ID NO: 95)

XXX xxxxx 

CAACTGCTGcTgAAcGGCAGcCTgGCcGAgGAgGAGGTAGTA (SEQ ID NO: 96) 
M12 (6590-6624) (35-mer) 

TCTGTCAATTTCACGGACAATGCTAAAACCATAAT (SEQ ID NO: 97) 
X X XXX 

TCTGCCAAcTTCACcGACAAcGCcAAgACCATAAT (SEQ ID NO: 98)

M13 (6632-6663) (32-mer) 

CTGAACACATCTGTAGAAATTAATTGTACAAG (SEQ ID NO: 99) 

XXXXXX 
CTGAACCAgTCcGTgGAgATcAAcTGTACAAG (SEQ ID NO: 100) 

M14 (6667-6697) (31-mer) 
CAACAACAATACAAGAAAAAGAATCCGTATC (SEQ ID NO: 101)
X X X XX X
CAACAACAAcACcGGcAAgcGcATCCGTATC (SEQ ID NO: 102)

M15 (6806-6852) (47-mer)

GCTAGCAAATTAAGAGAACAATTTGGAAATAATAAAACAATAATCTT (SEQ ID 
NO: 103) 

xxxxxxxxxxxxx 

GCTAGCAAgcTgcGcGAgCAgTAcGGgAAcAAcAAgACcATAATCTT (SEQ ID 
NO: 104) 

M16 (nt 6917-6961) (45-mer)

TTCTACTGTAATTCAACACAACTGTTTAATAGTACTTGGTTTAAT (SEQ ID NO: 
105) 

xxxxx xx xx 

TTCTACTGgAAcTCcACcCAgCTGTTcAAcAGcACcTGGTTTAAT (SEQ ID NO: 
106) 

M17 (nt 7006-7048) (43-mer) 
CACAATCACCCTCCCATGCAGAATAAAACAAATTATAAACATG (SEQ ID NO:
107) 

xxx xxxxxx 

CACAATCACcCTgCCcTGCcGcATcAAgCAgATcATAAACATG (SEQ ID NO: 
108) 

M18 (nt 7084-7129) (46-mer) 
CATCAGTGGACAAATTAGATGTTCATCAAATATTACAGGGCTGCTA (SEQ ID NO:
109) 

xxxxxxxxxxx 

CATCAGCGGcCAgATccGcTGcTCcTCcAAcATcACcGGGCTGCTA (SEQ ID NO: 
110)

M19 (nt 7195-7252) (58-mer) 

GAGGGACAATTGGAGAAGTGAATTATATAAATATAAAGTAGTAAAAATTGAACCATTA 
(SEQ ID NO: 111)

x xxxxxxxxxxxxx 

GAGGGACAAcTGGAGgAGcGAgcTgTAcAAgTAcAAgGTgGTgAAgATcGAACCATTA 
(SEQ ID NO: 112) 

M20 (nt 7594-7633) (40-mer) 

GCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAG (SEQ ID NO: 113) 
XXX XXXX

GCCTTGGAAcGCcAGcTGGAGcAAcAAgTCcCTGGAACAG (SEQ ID NO: 114) 

M21 (nt 7658-7689) (32-mer) 

GAGTGGGACAGAGAAATTAACAATTACACAAG (SEQ ID NO: 115) 

XXXX X 
GAGTGGGACcGcGAgATcAACAAcTACACAAG (SEQ ID NO: 116) 

M22 (nt 7694-7741) (48-mer)

ATACACTCCTTAATTGAAGAATCGCAAAACCAGCAAGAAAAGAATGAA (SEQ ID 
NO: 117) 

XXXXXXX XXX 
ATACACTCCcTgATcGAgGAgTCcCAgAACCAgCAgGAgAAGAATGAA (SEQ ID 
NO: 118)

M23 (nt 7954-7993) (40-mer)

CAGGCCCGAAGGAATAGAAGAAGAAGGTGGAGAGAGAGAC (SEQ ID NO: 119)

xxxxxxxx

CAGGCCCGAgGGcATcGAgGAgGAgGGcGGcGAGAGAGAC (SEQ ID NO: 120)
M24 (nt 8072-8121) (50-mer) 

TACCACCGCTTGAGAGACTTACTCTTGATTGTAACGAGGATTGTGGAACT (SEQ ID 
NO: 121)

XXXXXX XX X 
TACCACCGCcTGcGcGACcTgCTCcTGATcGTgACGAGGATcGTGGAACT (SEQ ID 
NO: 122) 

M25 (nt 8136-8179) (44-mer) 

GGTGGGAAGCCCTCAAATATTGGTGGAATCTCCTACAGTATTGG (SEQ ID NO: 
123) 

X XX XX

GGTGGGAgGCCCTCAAgTAcTGGTGGAAcCTCCTcCAGTATTGG (SEQ ID NO: 
124) 

M26 (nt 8180-8219) (40-mer) 

AGTCAGGAACTAAAGAATAGTGCTGTTAGCTTGCTCAATG (SEQ ID NO: 125) 
XX XXXXXX 
AGTCAGGAgCTgAAGAAcAGcGCcGTgAaCcTGCTCAATG (SEQ ID NO: 126)



Comments : 

Although the vast majority of oligonucleotides 
follow the HXB2 sequence, some exceptions are noted: 
in oligo M15, nt 6807 follows the pNL43
sequence. (Specifically, nt 6807 is C in NL43 but A in
HXB2.) Oligo M26 has the nucleotide sequence derived from
pNL43. 
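
The oligonucleotides of Table 4 are described as changing AT-rich domains, including AATAAA elements, within the env coding region. As an illustrative check only (not part of the described experiments), the sketch below compares the AT content and the number of AATAAA motifs in the wild-type and mutant M17 sequences listed above. The function names are hypothetical.

```python
import re


def at_fraction(seq):
    """Fraction of A and T nucleotides in a sequence (case-insensitive)."""
    s = seq.upper()
    return (s.count("A") + s.count("T")) / len(s)


def count_motif(seq, motif="AATAAA"):
    """Count occurrences of a motif, including overlapping ones."""
    return len(re.findall("(?=" + re.escape(motif) + ")", seq.upper()))


# M17 (nt 7006-7048): wild type (SEQ ID NO: 107) and mutant (SEQ ID NO: 108)
M17_WT = "CACAATCACCCTCCCATGCAGAATAAAACAAATTATAAACATG"
M17_MUT = "CACAATCACcCTgCCcTGCcGcATcAAgCAgATcATAAACATG"

for label, seq in (("wild type", M17_WT), ("mutant", M17_MUT)):
    print(label, round(at_fraction(seq), 2), count_motif(seq))
```

For this pair the mutant has a lower AT fraction than the wild type and no longer contains the AATAAA motif.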

EXAMPLE 5

USE OF P37M1-10D OR P55M1-13P0 IN
IMMUNOPROPHYLAXIS OR IMMUNOTHERAPY

In postnatal gene therapy, new genetic
information has been introduced into tissues by indirect
means such as removing target cells from the body,
infecting them with viral vectors carrying the new genetic
information, and then reimplanting them into the body; or
by direct means such as encapsulating formulations of DNA
in liposomes; entrapping DNA in proteoliposomes containing
viral envelope receptor proteins; calcium phosphate
co-precipitating DNA; and coupling DNA to a polylysine-
glycoprotein carrier complex. In addition, in vivo
infectivity of cloned viral DNA sequences after direct
intrahepatic injection with or without formation of
calcium phosphate coprecipitates has also been described.
mRNA sequences containing elements that enhance stability
have also been shown to be efficiently translated in
Xenopus laevis embryos, with the use of cationic lipid
vesicles. See, e.g., J. A. Wolff, et al., Science
247:1465-1468 (1990) and references cited therein.

Recently, it has also been shown that injection
of pure RNA or DNA directly into skeletal muscle results
in significant expression of genes within the muscle
cells. J. A. Wolff, et al., Science 247:1465-1468 (1990).
Forcing RNA or DNA into muscle cells by other
means, such as particle acceleration (N.-S. Yang, et
al., Proc. Natl. Acad. Sci. USA 87:9568-9572 (1990); S.R.
Williams et al., Proc. Natl. Acad. Sci. USA 88:2726-2730
(1991)) or viral transduction, should also allow the DNA
or RNA to be stably maintained and expressed. In the
experiments reported in Wolff et al., RNA or DNA vectors
were used to express reporter genes in mouse skeletal
muscle cells, specifically cells of the quadriceps
muscles. Protein expression was readily detected and no
special delivery system was required for these effects.
Polynucleotide expression was also obtained when the
composition and volume of the injection fluid and the
method of injection were modified from the described
protocol. For example, reporter enzyme activity was
reported to have been observed with 10 to 100 µl of
hypotonic, isotonic, and hypertonic sucrose solutions,
Opti-MEM, or sucrose solutions containing 2 mM CaCl2 and
also to have been observed when the 10- to 100-µl
injections were performed over 20 min. with a pump instead
of within 1 min.

Enzymatic activity from the protein encoded by
the reporter gene was also detected in abdominal muscle
injected with the RNA or DNA vectors, indicating that
other muscles can take up and express polynucleotides.
Low amounts of reporter enzyme were also detected in other
tissues (liver, spleen, skin, lung, brain and blood)
injected with the RNA and DNA vectors. Intramuscularly
injected plasmid DNA has also been demonstrated to be
stably expressed in non-human primate muscle. S. Jiao et
al., Hum. Gene Therapy 3:21-33 (1992).

It has been proposed that the direct transfer of
genes into human muscle in situ may have several potential
clinical applications. Muscle is potentially a suitable
tissue for the heterologous expression of a transgene that
would modify disease states in which muscle is not
primarily involved, in addition to those in which it is.
For example, muscle tissue could be used for the
heterologous expression of proteins that can immunize, be
secreted in the blood, or clear a circulating toxic
metabolite. The use of RNA and a tissue that can be
repetitively accessed might be useful for a reversible
type of gene transfer, administered much like conventional
pharmaceutical treatments. See J. A. Wolff, et al.,
Science 247:1465-1468 (1990) and S. Jiao et al., Hum. Gene
Therapy 3:21-33 (1992).

It had been proposed by J. A. Wolff et al.,
supra, that the intracellular expression of genes encoding
antigens might provide alternative approaches to vaccine
development. This hypothesis has been supported by a
recent report that plasmid DNA encoding influenza A
nucleoprotein injected into the quadriceps of BALB/c mice
resulted in the generation of influenza A nucleoprotein-
specific cytotoxic T lymphocytes (CTLs) and protection
from a subsequent challenge with a heterologous strain of
influenza A virus, as measured by decreased viral lung
titers, inhibition of mass loss, and increased survival.
J. B. Ulmer et al., Science 259:1745-1749 (1993).

Therefore, it appears that the direct injection
of RNA or DNA vectors encoding the viral antigen can be
used for endogenous expression of the antigen to generate
the viral antigen for presentation to the immune system,
without the need for self-replicating agents or adjuvants,
resulting in the generation of antigen-specific CTLs and
protection from a subsequent challenge with a homologous
or heterologous strain of virus.



CTLs in both mice and humans are capable of
recognizing epitopes derived from conserved internal viral
proteins and are thought to be important in the immune
response against viruses. By recognition of epitopes from
conserved viral proteins, CTLs may provide cross-strain
protection. CTLs specific for conserved viral antigens
can respond to different strains of virus, in contrast to
antibodies, which are generally strain-specific.

Thus, direct injection of RNA or DNA encoding
the viral antigen has the advantage of being without some
of the limitations of direct peptide delivery or viral
vectors. (See J. B. Ulmer et al., supra, and the
discussions and references therein.) Furthermore, the
generation of high-titer antibodies to expressed proteins
after injection of DNA indicates that this may be a facile
and effective means of making antibody-based vaccines
targeted towards conserved or non-conserved antigens,
either separately or in combination with CTL vaccines
targeted towards conserved antigens. These may also be
used with traditional peptide vaccines, for the generation
of combination vaccines. Furthermore, because protein
expression is maintained after DNA injection, the
persistence of B and T cell memory may be enhanced,
thereby engendering long-lived humoral and cell-mediated
immunity.

1. Vectors for the immunoprophylaxis or
immunotherapy against HIV-1

The mutated gag genomic sequences in vectors
p37M1-10D or p55M1-13P0 (Fig. 6) will be inserted in
expression vectors using a strong constitutive promoter
such as CMV or RSV, or an inducible promoter such as
HIV-1.

The vector will be introduced into animals or
humans in a pharmaceutically acceptable carrier using one
of several techniques such as injection of DNA directly
into human tissues; electroporation or transfection of the
DNA into primary human cells in culture (ex vivo),
selection of cells for desired properties and
reintroduction of such cells into the body (said
selection can be for the successful homologous
recombination of the incoming DNA to an appropriate
preselected genomic region); generation of infectious
particles containing the gag gene, infection of cells ex
vivo and reintroduction of such cells into the body; or
direct infection by said particles in vivo.

Substantial levels of protein will be produced 
leading to an efficient stimulation of the immune system. 

In another embodiment of the invention, the
described constructs will be modified to express mutated
gag proteins that are unable to participate in virus
particle formation. It is expected that such gag proteins
will stimulate the immune system to the same extent as the
wild-type gag protein, but be unable to contribute to
increased HIV-1 production. This modification should
result in safer vectors for immunotherapy and
immunoprophylaxis.

EXAMPLE 6

INHIBITION OF HIV-1 EXPRESSION USING TRANSDOMINANT
(TD) TD-GAG-TD-REV OR TD-GAG-PRO-TD-REV GENES

Direct injection of DNA or use of vectors other
than retroviral vectors will allow the constitutive high
level expression of trans-dominant gag (TDgag) in cells. In
addition, the approach taken by B.K. Felber et al.,
Science 239:184-187 (1988) will allow the generation of
retroviral vectors, e.g. mouse-derived retroviral vectors,
encoding HIV-1 TDgag, which will not interfere with the
infection of human cells by the retroviral vectors. In
the approach of Felber, et al., supra, it was shown that
fragments of the HIV-1 LTR containing the promoter and
part of the polyA signal can be incorporated without
detrimental effects within mouse retroviral vectors and
remain transcriptionally silent. The presence of Tat
protein stimulated transcription from the HIV-1 LTR and
resulted in the high level expression of genes linked to
the HIV-1 LTR.

The generation of hybrid TDgag-TDRev or TDgag-
pro-TDRev genes and the introduction of expression vectors
in human cells will allow the efficient production of two
proteins that will inhibit HIV-1 expression. The
incorporation of two TD proteins in the same vector is
expected to amplify the effects of each one on viral
replication. The use of the HIV-1 promoter in a manner
similar to the one described in B.K. Felber, et al., supra,
will allow high level gag and rev expression in infected
cells. In the absence of infection, expression will be
substantially lower. Alternatively, the use of other
strong promoters will allow the constitutive expression of
such proteins. This approach could be highly beneficial,
because of the production of a highly immunogenic gag,
which is not able to participate in the production of
infectious virus, but which, in fact, antagonizes such
production. This can be used as an efficient
immunoprophylactic or immunotherapeutic approach against
AIDS.

Examples of trans-dominant mutants are described
in Trono et al., Cell 59:112-120 (1989).

1. Generation of constructs encoding
transdominant gag mutant proteins

Gag mutant proteins that can act as trans-
dominant mutants, as described, for example, in Trono et
al., supra, will be generated by modifying vector
p37M1-10D or p55M1-13P0 to produce transdominant gag
proteins at high constitutive levels.

The transdominant gag protein will stimulate the
immune system and will inhibit the production of
infectious virus, but will not contribute to the
production of infectious virus.

The added safety of this approach makes it more
acceptable for human application.

Those skilled in the art will recognize that any
gene encoding a mRNA containing an inhibitory/instability
sequence or sequences can be modified in accordance with
the exemplified methods of this invention or their
functional equivalents.

Modifications of the above described modes for
carrying out the invention that are obvious to those of
skill in the fields of genetic engineering, protein
chemistry, medicine, and related fields are intended to be
within the scope of the following claims.

Every reference cited hereinbefore is hereby
incorporated by reference in its entirety.